haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-10 17:17:06 +02:00

Author	SHA1	Message	Date
Christopher Faulet	b1bb069c15	MINOR: tcpcheck: Don't handle anymore in-progress connect rules in tcpcheck_main The special handling of in-progress connect rules at the beginning of tcpcheck_main() function can be removed. Instead, at the beginning of the tcpcheck_eval_connect() function, we test if there is already an existing connection. In this case, it means we are waiting for a connection establishment. In addition, before evaluating a new connect rule, we take care to release any previous connection.	2020-11-27 10:29:41 +01:00
Christopher Faulet	b381a505c1	BUG/MAJOR: tcpcheck: Allocate input and output buffers from the buffer pool Historically, the input and output buffers of a check are allocated by hand during the startup, with a specific size (not necessarily the same as other buffers). But since the recent refactoring of the checks to rely exclusively on the tcp-checks and to use the underlying mux layer, this part is totally buggy. Indeed, because these buffers are now passed to a mux, they may be swapped if a zero-copy is possible. In fact, for now it is only possible in h2_rcv_buf(). Thus the bug concretely only exists if a h2 health-check is performed. But, it is a latent bug for other muxes. Another problem is the size of these buffers. Because it may differ from the size of the other buffers, it might be a source of bugs. Finally, for configurations with hundreds of thousands of servers, having 2 buffers per check always allocated may be an issue. To fix the bug, we now allocate these buffers when required using the buffer pool. Thus not-running checks don't waste memory and muxes may swap them if possible. The only drawback is that the check buffers now always have the same size as the buffers used by the streams. This indirectly deprecates the "tune.chksize" global option. In addition, the http-check regtest has been updated to perform some h2 health-checks. Many thanks to @VigneshSP94 for the help on this bug. This patch should solve the issue #936. It relies on the commit "MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main". Both must be backported as far as 2.2.	2020-11-27 10:29:41 +01:00
Christopher Faulet	39066c2738	MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main The special handling of in-progress send rules at the beginning of tcpcheck_main() function can be removed. Instead, at the beginning of the tcpcheck_eval_send() function, we test if there is some data in the output buffer. In this case, it means we are evaluating an unfinished send rule and we can jump to the sending part, skipping the formatting part. This patch is mandatory for a major fix on the checks and must be backported as far as 2.2.	2020-11-27 10:08:21 +01:00
Christopher Faulet	1faf18ae39	BUG/MINOR: tcpcheck: Don't forget to reset tcp-check flags on new kind of check When a new kind of check is found during the parsing of a proxy section (via an option directive), we must reset tcpcheck flags for this proxy. It is mandatory to not inherit some flags from a previously declared check (for instance in the default section). This patch must be backported as far as 2.2.	2020-11-27 10:08:18 +01:00
Willy Tarreau	5a7d6ebf2c	MINOR: fd/threads: silence a build warning with threads disabled Building with gcc-9.3.0 without threads may result in this warning:
    In file included from include/haproxy/api-t.h:36,
                     from include/haproxy/api.h:33,
                     from src/fd.c:90:
    src/fd.c: In function 'updt_fd_polling':
    include/haproxy/fd.h:507:11: warning: array subscript 63 is above array bounds of 'int[1]' [-Warray-bounds]
      507 | DISGUISE(write(poller_wr_pipe[tid], &c, 1));
    include/haproxy/compiler.h:92:41: note: in definition of macro 'DISGUISE'
       92 | #define DISGUISE(v) ({ typeof(v) __v = (v); ALREADY_CHECKED(__v); __v; })
          | ^
    src/fd.c:113:5: note: while referencing 'poller_wr_pipe'
      113 | int poller_wr_pipe[MAX_THREADS]; // Pipe to wake the threads
          | ^~~~~~~~~~~~~~
gcc is wrong but this time it cannot be blamed because it doesn't know that the FD's thread_mask always has at least one bit set. Let's add the test for all_threads_mask there. It will also remove that test and drop the else block.	2020-11-26 22:28:41 +01:00
Willy Tarreau	345ebcfc01	BUG/MAJOR: peers: fix partial message decoding Another bug in the peers message parser was uncovered by last commit `1dfd4f106` ("BUG/MEDIUM: peers: fix decoding of multi-byte length in stick-table messages"): the function return on incomplete message does not check if the channel has a pending close before deciding to return 0. It did not hurt previously because the loop calling co_getblk() once per character would have depleted the buffer and hit the end, causing <0 to be returned and matching the condition. But now that we process at once what is available this cannot be relied on anymore and it's now clearly visible that the final check is missing. What happens when this strikes is that if a peer connection breaks in the middle of a message, the function will return 0 (missing data) but the caller doesn't check for the closed buffer, subscribes to reads, and the applet handler is immediately called again since some data are still available. This is detected by the loop prevention and the process dies complaining that an appctx is spinning. This patch simply adds the check for closed channel. It must be backported to the same versions as the fix above.	2020-11-26 17:12:47 +01:00
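The missing check described in this commit can be illustrated with a minimal Python sketch. This is not HAProxy's code: the `Channel` class, the `recv_msg` function, and the return-code convention (positive for a consumed message, 0 for "wait for more data", -1 for "give up") are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Hypothetical stand-in for an input channel: buffered bytes plus a close flag."""
    data: bytes = b""
    closed: bool = False


def recv_msg(chn: Channel, msg_len: int) -> int:
    """Try to read one msg_len-byte message from the channel.

    Returns msg_len when a full message was consumed, 0 when more data may
    still arrive, and -1 when the message is incomplete and the channel is
    closed, so that the caller stops waiting instead of spinning.
    """
    if len(chn.data) >= msg_len:
        chn.data = chn.data[msg_len:]
        return msg_len
    if chn.closed:
        return -1  # incomplete message and no more data will ever come
    return 0       # incomplete message, waiting for more data is legitimate
```

Without the `closed` test, a broken connection holding half a message would keep returning 0 and the handler would be woken up again and again, which is the spinning behavior the commit message describes.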
Tim Duesterhus	23b2945c1c	BUG/CRITICAL: cache: Fix trivial crash by sending accept-encoding header Since commit `3d08236cb3` HAProxy can be trivially crashed remotely by sending an `accept-encoding` HTTP request header that contains 16 commas. This is because the `values` array in `accept_encoding_normalizer` accepts only 16 entries and it is not verified whether the end is reached during looping. Fix this issue by checking the length. This patch also simplifies the ist processing in the loop, because it manually calculated offsets and lengths, when the ist API exposes perfectly safe functions to advance and truncate ists. I wonder whether the accept_encoding_normalizer function is able to re-use some existing function for parsing headers that may contain lists of values. I'll leave this evaluation up to someone else, only patching the obvious crash. This commit is 2.4-dev specific and was merged just a few hours ago. No backport needed.	2020-11-25 10:23:00 +01:00
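The length check mentioned above can be sketched in Python as follows. The function name `split_bounded` and the choice to reject (rather than truncate) an oversized list are assumptions for illustration; the commit message only states that the length is now checked against the 16-entry array.

```python
MAX_VALUES = 16  # size of the fixed array mentioned in the commit message


def split_bounded(header: str, max_values: int = MAX_VALUES):
    """Split a comma-separated header value into at most max_values tokens.

    Returns the list of tokens, or None when the header holds more values
    than the fixed-size storage could take (the case that used to overflow).
    """
    values = []
    for token in header.split(","):
        token = token.strip().lower()
        if not token:
            continue  # ignore empty list elements such as ",,"
        if len(values) >= max_values:
            return None  # would write past the end of a 16-entry array
        values.append(token)
    return values
```

A header with 16 commas separating 17 non-empty values is the smallest input that reaches the `return None` branch in this sketch.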
Remi Tricot-Le Breton	754b2428d3	MINOR: cache: Add a process-vary option that can enable/disable Vary processing The cache section's process-vary option takes a 0 or 1 value to disable or enable the vary processing. When disabled, a response containing such a header will never be cached. When enabled, we will calculate a preliminary hash for a subset of request headers on all the incoming requests (which might come with a cpu cost) which will be used to build a secondary key for a given request (see RFC 7234#4.1). The default value is 0 (disabled).	2020-11-24 16:52:57 +01:00
Remi Tricot-Le Breton	1785f3dd96	MEDIUM: cache: Add the Vary header support Calculate a preliminary secondary key for every request we see so that we can have a real secondary key if the response is cacheable and contains a manageable Vary header. The cache's ebtree is now allowed to have multiple entries with the same primary key. Two of those entries will be distinguished thanks to secondary keys stored in the cache_entry (based on hashes of a subset of their headers). When looking for an entry in the cache (cache_use), we still use the primary key (built the same way as before), but in case of match, we also need to check if the entry has a vary signature. If it has one, we need to perform an extra check based on the newly built secondary key. We will only be able to forge a response out of the cache if both the primary and secondary keys match with one of our entries. Otherwise the request will be forwarded to the server.	2020-11-24 16:52:57 +01:00
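The two-level lookup described above can be sketched with a toy Python cache. The `VaryCache` class and its methods are hypothetical names; a plain dictionary of lists stands in for the ebtree, and the keys are opaque strings rather than real hashes.

```python
from collections import defaultdict


class VaryCache:
    """Toy cache keyed by a primary key, with an optional secondary key per entry."""

    def __init__(self):
        # Several entries may share one primary key.
        self._entries = defaultdict(list)

    def store(self, primary, body, secondary=None):
        # secondary is None for responses stored without a vary signature.
        self._entries[primary].append((secondary, body))

    def lookup(self, primary, secondary):
        """Return the cached body, or None when the request must go to the server."""
        for entry_secondary, body in self._entries.get(primary, []):
            if entry_secondary is None or entry_secondary == secondary:
                return body
        return None
```

An entry without a vary signature matches on the primary key alone, while an entry with one is served only when the secondary key built from the request matches as well.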
Remi Tricot-Le Breton	3d08236cb3	MINOR: cache: Prepare helper functions for Vary support The Vary functionality is based on a secondary key that needs to be calculated for every request to which a server answers with a Vary header. The Vary header, which can only be found in server responses, determines which headers of the request need to be taken into account in the secondary key. Since we do not want to have to store all the headers of the request until we have the response, we will pre-calculate as many sub-hashes as there are headers that we want to manage in a Vary context. We will only focus on a subset of headers which are likely to be mentioned in a Vary response (accept-encoding and referer for now). Every managed header will have its own normalization function which is in charge of transforming the header value into a core representation, more robust to insignificant changes that could exist between multiple clients. For instance, two accept-encoding values mentioning the same encodings but in different orders should give the same hash. This patch adds a function that parses a Vary header value and checks if all the values belong to our supported subset. It also adds the normalization functions for our two headers, as well as utility functions that can prebuild a secondary key for a given request and transform it into an actual secondary key after the vary signature is determined from the response.	2020-11-24 16:52:57 +01:00
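The normalization idea for accept-encoding (same encodings in a different order give the same hash) can be sketched in Python. The function names are hypothetical, `hashlib.sha256` is an arbitrary stand-in for whatever hash the cache really uses, and quality values such as `;q=0.5` are deliberately left untreated here.

```python
import hashlib


def normalize_accept_encoding(value: str) -> bytes:
    """Reduce an accept-encoding value to an order-insensitive form."""
    encodings = {tok.strip().lower() for tok in value.split(",") if tok.strip()}
    return ",".join(sorted(encodings)).encode()


def sub_hash(header_value: str) -> str:
    """Hash the normalized value; sha256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(normalize_accept_encoding(header_value)).hexdigest()
```

With this sketch, `sub_hash("gzip, br")` and `sub_hash("br,GZIP")` are equal, so two clients that differ only in ordering or case would share a secondary key.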
Christopher Faulet	401e6dbff3	BUG/MAJOR: filters: Always keep all offsets up to date during data filtering When at least one data filter is registered on a channel, the offsets of all filters must be kept up to date. For data filters but also for others. It is safer to do it in that way. Indirectly, this patch fixes 2 hidden bugs revealed by the commit `22fca1f2c` ("BUG/MEDIUM: filters: Forward all filtered data at the end of http filtering"). The first one, the worse of the two, happens at the end of http filtering when at least one data filter is registered on the channel. We call the http_end() callback function on the filters, when defined, to finish the http filtering. But it is performed for all filters. Before the commit `22fca1f2c`, the only risk was to call the http_end() callback function unexpectedly on a filter. Now, we may have an overflow on the offset variable, used at the end to forward all filtered data. Of course, from the moment we forward an arbitrarily huge amount of data, all kinds of bad things may happen. So offset computation is performed for all filters and http_end() callback function is called only for data filters. The other one happens when a data filter alters the data of a channel: it must update the offsets of all previous filters. But the offset of non-data filters must be up to date, otherwise, here too we may have an integer overflow. Another way to fix these bugs is to always ignore non-data filters from the offsets computation. But this patch is safer and probably easier to maintain. This patch must be backported in all versions where the above commit is. So as far as 2.0.	2020-11-24 14:17:32 +01:00
Maciej Zdeb	6dee9969b9	BUG/MEDIUM: http_act: Restore init of log-format list Restore init of log-format list in parse_http_del_header which was accidentally deleted by commit `ebdd4c55da` (implementation of different header matching methods for http-request/response del-header). This is related to GitHub issue #909	2020-11-24 10:33:46 +01:00
Ilya Shipitsin	d9a16dc0f2	BUILD: SSL: add BoringSSL guarding to "RAND_keep_random_devices_open" "RAND_keep_random_devices_open" is OpenSSL specific and is not present in other OpenSSL variants like LibreSSL or BoringSSL. BoringSSL recently "updated" its internal openssl version to 1.1.1, we temporarily set it back to 1.1.0, as we are going to remove that hack, let us add proper guarding.	2020-11-24 09:54:44 +01:00
Julien Pivotto	2de240a676	MINOR: stream: Add level 7 retries on http error 401, 403 Level-7 retries are only possible with a restricted number of HTTP return codes. While it is usually not safe to retry on 401 and 403, I came up with an authentication backend which was not synchronizing authentication of users. While not perfect, being allowed to also retry on those return codes is really helpful and acts as a hotfix until we can fix the backend. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-11-23 09:33:14 +01:00
Tim Duesterhus	c8d19702f4	BUILD: Show the value of DEBUG= in haproxy -vv Previously this was not visible after building.	2020-11-21 18:27:33 +01:00
Maciej Zdeb	ebdd4c55da	MINOR: http_act: Add -m flag for del-header name matching method This patch adds -m flag which allows to specify header name matching method when deleting headers from http request/response. Currently beg, end, sub, str and reg are supported. This is related to GitHub issue #909	2020-11-21 15:54:30 +01:00
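The five matching methods listed above (beg, end, sub, str, reg) can be illustrated with a short Python sketch. This is a hedged illustration, not the HAProxy implementation: `name_matches` and `del_headers` are hypothetical helpers, and treating the regex as case-insensitive is an assumption.

```python
import re


def name_matches(method: str, pattern: str, name: str) -> bool:
    """Return True when header `name` matches `pattern` using `method`.

    Header names are compared case-insensitively.
    """
    name_l, pat_l = name.lower(), pattern.lower()
    if method == "str":   # exact match
        return name_l == pat_l
    if method == "beg":   # prefix match
        return name_l.startswith(pat_l)
    if method == "end":   # suffix match
        return name_l.endswith(pat_l)
    if method == "sub":   # substring match
        return pat_l in name_l
    if method == "reg":   # regular expression match
        return re.search(pattern, name, re.IGNORECASE) is not None
    raise ValueError(f"unknown matching method: {method}")


def del_headers(headers, method, pattern):
    """Return the (name, value) pairs whose name does not match."""
    return [(n, v) for n, v in headers if not name_matches(method, pattern, n)]
```

For instance, `del_headers(hdrs, "beg", "x-")` keeps only the headers whose name does not start with `x-`.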
Maciej Zdeb	302b9f8d7a	BUG/MINOR: http_htx: Fix searching headers by substring Function __http_find_header is used to search headers by name using specified matching method. Matching by substring returned unexpected results due to wrong length of substring supplied to strnistr function. Fixed also the boolean condition by inverting it, as we're interested in headers that contain the substring. This patch should be backported as far as 2.2	2020-11-21 15:54:26 +01:00
Willy Tarreau	3aab17bd56	BUG/MAJOR: connection: reset conn->owner when detaching from session list Baptiste reported a new crash affecting 2.3 which can be triggered when using H2 on the backend, with http-reuse always and with tens of clients doing close only. There are a few combined cases which cause this to happen, but each time the issue is the same, an already freed session is dereferenced in session_unown_conn(). Two cases were identified to cause this:
- a connection referencing a session as its owner, which is detached from the session's list and is destroyed after this session ends. The test on conn->owner before calling session_unown_conn() is not sufficient as the pointer is not null but is not valid anymore.
- a connection that never goes idle and that gets killed from the mux, where session_free() is called first, then conn_free() calls session_unown_conn() which scans the just freed session for older connections. This one is only triggered with DEBUG_UAF.
The reason for this session to be present here is that it's needed during the connection setup, to be passed to conn_install_mux_be() to mux->init() as the owning session, but it's never deleted afterwards. Furthermore, even conn_session_free() doesn't delete this pointer after freeing the session that lies there. Both do definitely result in a use-after-free that's more easily triggered under DEBUG_UAF. This patch makes sure that the owner is always deleted after detaching or killing the session. However it is currently not possible to clear the owner right after a synchronous init because the proxy protocol apparently needs it (a reg test checks this), and if we leave it past the connection setup with the session not attached anywhere, it's hard to catch the right moment to detach it. This means that the session may remain in conn->owner as long as the connection has never been added to nor removed from the session's idle list.
Given that this patch needs to remain simple enough to be backported, instead it adds a workaround in session_unown_conn() to detect that the element is already not attached anywhere. This fix absolutely requires previous patch "CLEANUP: connection: do not use conn->owner when the session is known" otherwise the situation will be even worse, as some places used to rely on conn->owner instead of the session. The fix could theoretically be backported as far as 1.8. However, the code in this area has significantly changed along versions and there are more risks of breaking working stuff than fixing real issues there. The issue was really woken up in two steps during 2.3-dev when slightly reworking the idle conns with commit `08016ab82` ("MEDIUM: connection: Add private connections synchronously in session server list") and when adding support for storing used H2 connections in the session and adding the necessary call to session_unown_conn() in the muxes. But the same test managed to crash 2.2 when built in DEBUG_UAF and patched like this, proving that we used to already leave dangling pointers behind us:
| diff --git a/include/haproxy/connection.h b/include/haproxy/connection.h
| index f8f235c1a..dd30b5f80 100644
| --- a/include/haproxy/connection.h
| +++ b/include/haproxy/connection.h
| @@ -458,6 +458,10 @@ static inline void conn_free(struct connection *conn)
|                         sess->idle_conns--;
|                 session_unown_conn(sess, conn);
|         }
| +       else {
| +               struct session *sess = conn->owner;
| +               BUG_ON(sess && sess->origin != &conn->obj_type);
| +       }
|
|         sockaddr_free(&conn->src);
|         sockaddr_free(&conn->dst);
It's uncertain whether an existing code path there can lead to dereferencing conn->owner when it's bad, though certain suspicious memory corruption bugs make one think it's a likely candidate. The patch should not be hard to adapt there. Backports to 2.1 and older are left to the appreciation of the person doing the backport.
A reproducer consists in this:
    global
      nbthread 1
    listen l
      bind :9000
      mode http
      http-reuse always
      server s 127.0.0.1:8999 proto h2
    frontend f
      bind :8999 proto h2
      mode http
      http-request return status 200
Then this will make it crash within 2-3 seconds:
    $ h1load -e -r 1 -c 10 http://0:9000/
If it does not, it might be that DEBUG_UAF was not used (it's harder then) and it might be useful to restart.	2020-11-21 15:29:22 +01:00
Willy Tarreau	38b4d2eb22	CLEANUP: connection: do not use conn->owner when the session is known At a few places we used to rely on conn->owner to retrieve the session while the session is already known. This is not correct because at some of these points the reason the connection's owner was still the session (instead of NULL) is a mistake. At one place a comparison is even made between the session and conn->owner assuming it's valid without checking if it's NULL. Let's clean this up to use the session all the time. Note that this will be needed for a forthcoming fix and will have to be backported.	2020-11-21 15:29:22 +01:00
Ilya Shipitsin	f34ed0b74c	BUILD: SSL: guard TLS13 ciphersuites with HAVE_SSL_CTX_SET_CIPHERSUITES HAVE_SSL_CTX_SET_CIPHERSUITES is a newly defined macro set in openssl-compat.h, which helps to identify ssl libs (currently OpenSSL-1.1.1 only) that support TLS13 ciphersuites manipulation on TLS13 context	2020-11-21 11:04:36 +01:00
William Lallemand	77e1c6fb0a	BUG/MEDIUM: ssl/crt-list: fix error when no file found When a file from a crt-list was not found, it was silently ignored, letting HAProxy start without it. This bug was introduced by `47da821` ("MEDIUM: ssl: emulates the multi-cert bundles in the crtlist"). This commit adds a found variable which is checked once we tried every bundle combination so we can exit with an error if none were found. Must be backported to 2.3.	2020-11-20 18:38:56 +01:00
William Lallemand	7340457158	BUG/MINOR: ssl/crt-list: load bundle in crt-list only if activated Don't try to load a bundle from a crt-list if the bundle support was disabled with ssl-load-extra-files. Must be backported to 2.3.	2020-11-20 18:38:56 +01:00
William Lallemand	06ce84a100	BUG/MEDIUM: ssl: error when no certificate are found When a non-existing file was specified in the configuration, haproxy does not exit with an error, which is not normal. This bug was introduced by `dfa93be` ("MEDIUM: ssl: emulate multi-cert bundles loading in standard loading") which does nothing if the stat failed. This patch introduces a "found" variable which is checked at the end of the function so we exit with an error if no file was found. Must be backported to 2.3.	2020-11-20 18:38:56 +01:00
William Lallemand	86c2dd60f1	BUG/MEDIUM: ssl/crt-list: bundle support broken in crt-list In issue #970 it was reported that the bundle loading does not work anymore with crt-list. This bug was introduced by `47da821` ("MEDIUM: ssl: emulates the multi-cert bundles in the crtlist") which incorrectly uses "path" instead of "crt_path" in the name resolution. Must be backported to 2.3.	2020-11-20 18:38:51 +01:00
Christopher Faulet	aab1b67383	BUG/MEDIUM: http-ana: Don't eval http-after-response ruleset on empty messages It is not possible on a response coming from a server, but an errorfile may be empty. In this case, the http-after-response ruleset must not be evaluated because it is totally unexpected to manipulate headers on an empty HTX message. This patch must be backported everywhere the http-after-response rules are supported, i.e. as far as 2.2.	2020-11-20 09:43:31 +01:00
Ilya Shipitsin	bdec3ba796	BUILD: ssl: use SSL_MODE_ASYNC macro instead of OPENSSL_VERSION	2020-11-19 19:59:32 +01:00
William Lallemand	f69cd68737	BUG/MINOR: ssl: segv on startup when AKID but no keyid In bug #959 it was reported that haproxy segfaults on startup when trying to load a certificate which uses the X509v3 AKID extension but without the keyid field. This field is not mandatory and could be replaced by the serial or the DirName. For example:
    X509v3 extensions:
        X509v3 Basic Constraints:
            CA:FALSE
        X509v3 Subject Key Identifier:
            42:7D:5F:6C:3E:0D:B7:2C:FD:6A:8A:32:C6:C6:B9:90:05:D1:B2:9B
        X509v3 Authority Key Identifier:
            DirName:/O=HAProxy Technologies/CN=HAProxy Test Intermediate CA
            serial:F2:AB:C1:41:9F:AB:45:8E:86:23:AD:C5:54:ED:DF:FA
This bug was introduced by 70df7b ("MINOR: ssl: add "issuers-chain-path" directive"). This patch must be backported as far as 2.2.	2020-11-19 16:24:13 +01:00
William Dauchy	f63704488e	MEDIUM: cli/ssl: configure ssl on server at runtime in the context of a progressive backend migration, we want to be able to activate SSL on outgoing connections to the server at runtime without reloading. This patch adds a `set server ssl` command; in order to allow that: - add `srv_use_ssl` to `show servers state` command for compatibility, also update associated parsing - when using default-server ssl setting, and `no-ssl` on server line, init SSL ctx without activating it - when triggering ssl API, de/activate SSL connections as requested - clean ongoing connections as it is done for addr/port changes, without checking prior server state example config: backend be_foo default-server ssl server srv0 127.0.0.1:6011 weight 1 no-ssl show servers state: 5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - -1 where srv0 can switch to ssl later during the runtime: set server be_foo/srv0 ssl on 5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - 1 Also update existing tests and create a new one. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-11-18 17:22:28 +01:00
William Dauchy	fc52f524b0	MINOR: ssl: create common ssl_ctx init a common init for ssl_ctx will be later usable in other functions in order to support hot enable of ssl during runtime. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-11-18 17:22:28 +01:00
Amaury Denoyelle	034c162b9b	MEDIUM: stats: add counters for failed handshake Report on ssl stats the total number of handshakes terminated in a failure.	2020-11-18 16:10:42 +01:00
Amaury Denoyelle	f70b7db825	MINOR: ssl: remove client hello counters Remove the ssl client hello received counter. This counter is not meaningful and was only implemented on the frontend.	2020-11-18 16:10:42 +01:00
Christopher Faulet	47d9a4e870	MINOR: flt-trace: Use a bitfield for the trace options Instead of using an integer for each option, we now use a bitfield. Each option is represented as a flag now.	2020-11-17 11:34:36 +01:00
Christopher Faulet	96a577acae	MINOR: flt-trace: Add an option to inhibits trace messages The 'quiet' option may be set to inhibit the trace messages. The trace filter is a bit verbose. This option may be used to not display the messages.	2020-11-17 11:34:36 +01:00
Christopher Faulet	c41d8bd65a	CLEANUP: flt-trace: Remove unused random-parsing option This option was only used by the legacy HTTP mode. In HTX, it is not used. So it can be removed.	2020-11-17 11:34:30 +01:00
Christopher Faulet	63c69a9b4e	BUG/MINOR: http-ana: Don't wait for the body of CONNECT requests CONNECT requests are bodyless messages but with no EOM blocks. Thus, conditions to stop waiting for the message payload are not suited to this kind of messages. Indeed, the message finishes on an EOH block. But the tunnel mode at the stream level is only set in HTTP_XFER_BODY analyser. So, the stream is blocked, waiting for a body that does not exist till a timeout expires. To fix this bug, we just stop waiting for a body for CONNECT requests. Another solution is to rely on HTX_SL_F_BODYLESS/HTTP_MSGF_BODYLESS flags. But this one is less intrusive. This message must be backported as far as 2.0. For the 2.0, only the HTX part must be fixed.	2020-11-17 10:03:12 +01:00
Christopher Faulet	22fca1f2c8	BUG/MEDIUM: filters: Forward all filtered data at the end of http filtering When http filtering ends, if there are some filtered data not forwarded yet, we forward them, in flt_http_end(). Most of the time, this doesn't happen, except when a tunnel is established using a CONNECT. In this case, there is no EOM on the request and there is no body. Thus the headers are never forwarded, blocking the stream. This patch must be backported as far as 2.0. Prior versions don't suffer from this bug because there is no HTX support. On the 2.0, the change is only applicable on HTX streams. A special test must be performed to make sure.	2020-11-17 09:59:35 +01:00
Eric Salama	9139ec34ed	MINOR: cfgparse: tighten the scope of newnameserver variable, free it on error. This should fix issue GH #931. Also remove a misleading comment. This commit can be backported as far as 1.9	2020-11-13 16:26:10 +01:00
Christopher Faulet	fc633b6eff	CLEANUP: config: Return ERR_NONE from config callbacks instead of 0 Return ERR_NONE instead of 0 on success for all config callbacks that should return ERR_* codes. There is no change because ERR_NONE is a macro equal to 0. But this makes the return value more explicit.	2020-11-13 16:26:10 +01:00
Christopher Faulet	5214099233	MINOR: config/mux-h2: Return ERR_ flags from init_h2() instead of a status post-check function callbacks must return ERR_* flags. Thus, init_h2() is fixed to return ERR_NONE on success or (ERR_ALERT|ERR_FATAL) on error. This patch may be backported as far as 2.2.	2020-11-13 16:26:10 +01:00
Christopher Faulet	83fefbcdff	MINOR: init: Fix the prototype for per-thread free callbacks Functions registered to release memory per-thread have no return value. But the registering function and the function pointer in per_thread_free_fct structure specify it should return an integer. This patch fixes it. This patch may be backported as far as 2.0.	2020-11-13 16:26:10 +01:00
Christopher Faulet	c751b4508d	BUG/MINOR: tcpcheck: Don't warn on unused rules if check option is after When tcp-check or http-check rules are used, if the corresponding check option (option tcp-check and option httpchk) is declared after the ruleset, a warning is emitted about an unused check ruleset while there is no problem in reality. This patch must be backported as far as 2.2.	2020-11-13 16:26:10 +01:00
Christopher Faulet	c7ba91039a	MINOR: spoe: Don't close connection in sync mode on processing timeout In sync mode, if an applet receives an ack while the processing delay has already expired, there is no frame waiting for this ack. But there is no reason to close the connection in this case. The ack may be ignored and the connection may be reused to process another frame. The only reason to trigger an error and close the connection is when the wrong ack is received while there is still a frame waiting for its ack. In sync mode, this should never happen. This patch may be backported in all versions supporting the SPOE.	2020-11-13 16:26:10 +01:00
Christopher Faulet	cf181c76e3	BUG/MAJOR: spoe: Be sure to remove all references on a released spoe applet When a SPOE applet is used to send a frame, a reference on this applet is saved in the spoe context of the offloaded stream. But, if the applet is released before receiving the corresponding ack, we must be sure to remove this reference. This was performed for fragmented frames only. But it must also be performed for spoe contexts in the applet waiting_queue and in the thread waiting_queue (used in async mode). This bug leads to a memory corruption when an offloaded stream tries to update the state of a released applet because it still has a reference on it. There are many ways to trigger this bug. The easiest is probably during reloads. On the old process, all applets are woken up to be released ASAP. Many thanks to Maciej Zdeb for reporting the bug and working on it for 2 months. Without his help, it would have been much more difficult to fix the bug. It is always a huge pleasure to see how enthusiastic and helpful some users are. Thanks again Maciej! This patch must be backported to all versions where the spoe is supported (>= 1.7).	2020-11-13 16:26:10 +01:00
Christopher Faulet	3005d28eb8	BUG/MINOR: http-htx: Handle warnings when parsing http-error and http-errors First of all, this patch is tagged as a bug. But in fact, it only fixes a bug in the 2.2. On the 2.3 and above, it only adds the ability to display warnings, when an http-error directive is parsed from a proxy section and when an errorfile directive is parsed from a http-errors section. But on the 2.2, it makes sure to display the warning emitted on a content-length mismatch when an errorfile is parsed. The following is only applicable to the 2.2. commit "BUG/MINOR: http-htx: Just warn if payload of an errorfile doesn't match the C-L" (which is only present in 2.2, 2.1 and 2.0 trees, i.e. see commit 7bf3d81d3cf4b9f4587 in 2.2 tree), is changing the behavior of `http_str_to_htx` function. It may now emit warnings. And, it is the caller's responsibility to display it. But the warning is missing when an 'http-error' directive is parsed from a proxy section. It is also missing when an 'errorfile' directive is parsed from a http-errors section. This bug only exists on the 2.2. On earlier versions, these directives are not supported and on later ones, an error is triggered instead of a warning. Thanks to William Dauchy that spotted the bug. This patch must be backported as far as 2.2.	2020-11-13 16:26:10 +01:00
Amaury Denoyelle	90eb93f792	MINOR: check: report error on incompatible connect proto Report an error when using an explicit proto for a connect rule with non-compatible mode in regards with the selected check type (tcp-check vs http-check).	2020-11-13 16:26:10 +01:00
Amaury Denoyelle	7c14890183	MINOR: check: report error on incompatible proto If the check mux has been explicitly defined but is incompatible with the selected check type (tcp-check vs http-check), report a warning and prevent haproxy startup.	2020-11-13 16:26:10 +01:00
Amaury Denoyelle	0519bd4d04	BUG/MEDIUM: check: reuse srv proto only if using same mode Only reuse the mux from server if the check is using the same mode. For example, this prevents a tcp-check on a h2 server to select the h2 multiplexer instead of passthrough. This bug was introduced by the following commit : BUG/MEDIUM: checks: Use the mux protocol specified on the server line It must be backported up to 2.2. Fixes github issue #945.	2020-11-13 16:26:10 +01:00
Christopher Faulet	97fc8da264	BUG/MINOR: http-fetch: Fix calls w/o parentheses of the cookie sample fetches req.cook, req.cook_val, req.cook_cnt and their response counterparts may be called without cookie name. In this case, empty parentheses may be used, or no parentheses at all. In both cases, the result must be the same. But only the first one works. The second one always returns a failure. This patch fixes this bug. Note that on old versions (< 2.2), both cases fail. This patch must be backported in all stable versions.	2020-11-13 16:26:10 +01:00
Maciej Zdeb	dea7c209f8	BUG/MINOR: http-fetch: Extract cookie value even when no cookie name HTTP sample fetches dealing with the cookies (req/res.cook, req/res.cook_val and req/res.cook_cnt) must be prepared to be called without cookie name. For the first two, the first cookie value is returned, regardless its name. For the last one, all cookies are counted. To do so, http_extract_cookie_value() may now be called with no cookie name (cookie_name_l set to 0). In this case, the matching on the cookie name is ignored and the first value found is returned. Note this patch also fixes matching on cookie values in ACLs. This should be backported in all stable versions.	2020-11-13 16:26:10 +01:00
Willy Tarreau	1dfd4f106f	BUG/MEDIUM: peers: fix decoding of multi-byte length in stick-table messages There is a bug in peer_recv_msg() due to an incorrect cast when trying to decode the varint length of a stick-table message, causing lengths comprised between 128 and 255 to consume one extra byte, ending in protocol errors. The root cause of this is that peer_recv_msg() tries hard to reimplement all the parsing and control that is already done in intdecode() just to measure the length before calling it. And it got it wrong. Let's just get rid of this unneeded code duplication and solely rely on intdecode() instead. The bug was introduced in 2.0 as part of a cleanup pass on this code with commit `95203f218` ("MINOR: peers: Move high level receive code to reduce the size of I/O handler."), so this patch must be backported to 2.0. Thanks to Yves Lafon for reporting the problem.	2020-11-13 15:21:50 +01:00
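As an illustration of the point made in 1dfd4f106f (one decoder that returns both the value and the number of bytes consumed, instead of a second hand-written length check), here is a minimal sketch. It uses a generic LEB128-style varint, which is an assumption made for the example and not HAProxy's intdecode() wire format; `varint_decode` is a hypothetical name. Bytes are read through `unsigned char` so that values in the 128..255 range are never seen as negative, which is the class of cast mistake the commit describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not HAProxy code: decodes a generic LEB128-style
 * varint (7 payload bits per byte, high bit set means "more bytes follow").
 * Returns the number of bytes consumed, or 0 if the input is truncated or
 * longer than a 64-bit value allows.
 */
size_t varint_decode(const unsigned char *buf, size_t len, uint64_t *out)
{
    uint64_t value = 0;
    unsigned int shift = 0;
    size_t i;

    assert(buf != NULL && out != NULL);

    for (i = 0; i < len && shift < 64; i++, shift += 7) {
        /* unsigned on purpose: 0x80..0xFF must not compare as negative */
        unsigned char byte = buf[i];

        value |= (uint64_t)(byte & 0x7f) << shift;
        if (!(byte & 0x80)) {
            *out = value;
            return i + 1; /* the caller advances by exactly this much */
        }
    }
    return 0; /* ran out of input or of bits before the final byte */
}
```

The caller only ever advances by the returned count, so the length of the encoded integer is computed in a single place.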
Frédéric Lécaille	ea875e62e6	BUG/MINOR: peers: Missing TX cache entries reset. The TX part of a cache for a dictionary is made of a reserved array of ebtree nodes which are pointers to dictionary entries. So when we flush the TX part of such a cache, we must not only remove these nodes to dictionary entries from their ebtree. We must also reset their values. Furthermore, the LRU key and the last lookup result must also be reset.	2020-11-13 06:04:18 +01:00
Frédéric Lécaille	f9e51beec1	BUG/MINOR: peers: Do not ignore a protocol error for dictionary entries. If we could not decode the ID of a dictionary entry from a peer update message, we must inform the remote peer about such an error as this is done for any other decoding error.	2020-11-13 06:04:08 +01:00
Frédéric Lécaille	d865935f32	MINOR: peers: Add traces to peer_treat_updatemsg(). Add minimalistic traces for peers with only one event to diagnose potential issues when decoding peer update messages.	2020-11-12 17:38:49 +01:00
Amaury Denoyelle	7f8f6cb926	BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one Define a per-thread counters allocated with the greatest size of any stat module counters. This variable is named trash_counters. When using a proxy without allocated counters, return the trash counters from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent segfault. This is useful for all the proxies used internally and not belonging to the global proxy list. As these objects do not appear on the stat report, it does not matter to use the dummy counters. For this fix to be functional, the extra counters are explicitly initialized to NULL on proxy/server/listener init functions. Most notably, the crash has already been detected with the following vtc: - reg-tests/lua/txn_get_priv.vtc - reg-tests/peers/tls_basic_sync.vtc - reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc There are probably other parts that may be impacted (SPOE for example). This bug was introduced in the current release and does not need to be backported. The faulty commits are "MINOR: ssl: count client hello for stats" and "MINOR: ssl: add counters for ssl sessions".	2020-11-12 15:16:05 +01:00
Amaury Denoyelle	a2a6899bee	BUG/MINOR: stats: free dynamically stats fields/lines on shutdown Register a new function on POST DEINIT to free stats fields/lines for each domain. This patch does not fix a critical bug but may be backported to 2.3.	2020-11-12 15:16:05 +01:00
Remi Tricot-Le Breton	cc9bf2e5fe	MEDIUM: cache: Change caching conditions Do not cache responses that do not have an explicit expiration time (s-maxage or max-age Cache-Control directives or Expires header) or a validator (ETag or Last-Modified headers) anymore, as suggested in RFC 7234#3. The TX_FLAG_IGNORE flag is used instead of the TX_FLAG_CACHEABLE so as not to change the behavior of the checkcache option.	2020-11-12 11:22:05 +01:00
Thierry Fournier	91dc0c0d8f	BUG/MINOR: lua: set buffer size during map lookups This size is used by some pattern matching to determine if there is sufficient room in the buffer to add final \0 if necessary. If the size is not set, the conditions use uninitialized value. Note: it seems this bug can't cause a crash. Should be backported until 2.2 (at least)	2020-11-11 10:43:21 +01:00
Thierry Fournier	a68affeaa9	BUG/MINOR: pattern: a sample marked as const could be written The functions add final 0 to string if the final 0 is not set, but don't check the flag CONST. This patch duplicates the strings if the final zero is not set and the string is CONST. Should be backported until 2.2 (at least)	2020-11-11 10:43:15 +01:00
William Lallemand	50c03aac04	BUG/MEDIUM: ssl/crt-list: correctly insert crt-list line if crt already loaded In issue #940, it was reported that the crt-list does not work correctly anymore. Indeed when inserting a crt-list line which use a certificate previously seen in the crt-list, this one won't be inserted in the SNI list and will be silently ignored. This bug was introduced by commit `47da821` "MEDIUM: ssl: emulates the multi-cert bundles in the crtlist". This patch also includes a reg-test which tests this issue. This bugfix must be backported in 2.3.	2020-11-06 16:39:39 +01:00
Willy Tarreau	431a12cafe	BUILD: http-htx: fix build warning regarding long type in printf Commit `a66adf41e` ("MINOR: http-htx: Add understandable errors for the errorfiles parsing") added a warning when loading malformed error files, but this warning may trigger another build warning due to the %lu format used. Let's simply cast it for output since it's just used for end user output. This must be backported to 2.0 like the commit above.	2020-11-06 14:24:02 +01:00
Willy Tarreau	4299528390	BUILD: ssl: silence build warning on uninitialised counters Since commit `d0447a7c3` ("MINOR: ssl: add counters for ssl sessions"), gcc 9+ complains about this: CC src/ssl_sock.o src/ssl_sock.c: In function 'ssl_sock_io_cb': src/ssl_sock.c:5416:3: warning: 'counters_px' may be used uninitialized in this function [-Wmaybe-uninitialized] 5416 \| ++counters_px->reused_sess; \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_sock.c:5133:23: note: 'counters_px' was declared here 5133 \| struct ssl_counters counters, counters_px; \| ^~~~~~~~~~~ Either a listener or a server are expected there, so their counters are always initialized and the compiler cannot know this. Let's preset them and test before updating the counter, we're not in a hot path here. No backport is needed.	2020-11-06 13:22:44 +01:00
Willy Tarreau	f5fe70620c	MINOR: server: remove idle lock in srv_cleanup_connections This function used to grab the idle lock when scanning the threads for idle connections, but it doesn't need it since the lock only protects the tree. Let's remove it.	2020-11-06 13:22:44 +01:00
Amaury Denoyelle	d0447a7c3e	MINOR: ssl: add counters for ssl sessions Add counters for newly established and resumed sessions.	2020-11-06 12:05:17 +01:00
Amaury Denoyelle	fbc3377cd4	MINOR: ssl: count client hello for stats Add a counter for ssl client_hello received on frontends.	2020-11-06 12:05:17 +01:00
Amaury Denoyelle	9963fa74d2	MINOR: ssl: instantiate stats module This module is responsible for providing statistics for ssl. It allocates counters for frontend/backend/listener/server objects.	2020-11-06 12:05:17 +01:00
Christopher Faulet	a66adf41ea	MINOR: http-htx: Add understandable errors for the errorfiles parsing No details are provided when an error occurs during the parsing of an errorfile, Thus it is a bit hard to diagnose where the problem is. Now, when it happens, an understandable error message is reported. This patch is not a bug fix in itself. But it will be required to change an fatal error into a warning in last stable releases. Thus it must be backported as far as 2.0.	2020-11-06 09:13:58 +01:00
Willy Tarreau	6d27a92b83	BUG/MINOR: ssl: don't report 1024 bits DH param load error when it's higher The default dh_param value is 2048 and it's preset to zero unless explicitly set, so we must not report a warning about DH param not being loadable in 1024 bits when we're going to use 2048. Thanks to Dinko for reporting this. This should be backported to 2.2.	2020-11-05 19:40:14 +01:00
Jerome Magnin	eff2e0a958	CLEANUP: cfgparse: remove duplicate registration for transparent build options Since commit `37bafdcbb` ("MINOR: sock_inet: move the IPv4/v6 transparent mode code to sock_inet"), build options for transparent proxying are registered twice. This patch removes the older one.	2020-11-05 19:27:16 +01:00
Willy Tarreau	38d41996c1	MEDIUM: pattern: turn the pattern chaining to single-linked list It does not require heavy deletion from the expr anymore, so we can now turn this to a single-linked list since most of the time we want to delete all instances of a given pattern from the head. By doing so we save 32 bytes of memory per pattern. The pat_unlink_from_head() function was adjusted accordingly.	2020-11-05 19:27:09 +01:00
Willy Tarreau	867a8a5a10	MINOR: pattern: prepare removal of a pattern from the list head Instead of using LIST_DEL() on the pattern itself inside an expression, we look it up from its head. The goal is to get rid of the double-linked list while this usage remains exclusively for freeing on startup error!	2020-11-05 19:27:09 +01:00
Willy Tarreau	2817472bb0	MINOR: pattern: during reload, delete elements from the ref, not the expression Instead of scanning all elements from the expression and using the slow delete path there, let's use the faster way which involves pat_delete_gen() while the elements are detached from their reference.	2020-11-05 19:27:09 +01:00
Willy Tarreau	ae83e63b48	MEDIUM: pattern: make pat_ref_prune() rely on pat_ref_purge_older() When purging all of a reference, it's much more efficient to scan the reference patterns from the reference head and delete all derivative patterns than to scan the expressions. The only thing is that we need to proceed both for the current and next generations, in case there is a huge gap between the two. With this, purging 20M IP addresses in small batches of 100 takes roughly 3 seconds.	2020-11-05 19:27:09 +01:00
Willy Tarreau	94b9abe200	MINOR: pattern: add pat_ref_purge_older() to purge old entries This function will be usable to purge at most a specified number of old entries from a reference. Entries are declared old if their generation number is in the past compared to the one passed in argument. This will ease removal of early entries when new ones have been appended. We also call malloc_trim() when available, at the end of the series, because this is one place where there is a lot of memory to save. Reloads of 1M IP addresses used in an ACL made the process grow up to 1.7 GB RSS after 10 reloads and roughly stabilize there without this call, versus only 260 MB when the call is present. Sadly there is no direct equivalent for jemalloc, which stabilizes around 800MB-1GB.	2020-11-05 19:27:09 +01:00
Willy Tarreau	1a6857b9c1	MINOR: pattern: implement pat_ref_load() to load a pattern at a given generation pat_ref_load() basically combines pat_ref_append() and pat_ref_commit(). It's very similar to pat_ref_add() except that it also allows to set the generation ID and the line number. pat_ref_add() was modified to directly rely on it to avoid code duplication. Note that a previous declaration of pat_ref_load() was removed as it was just a leftover of an earlier incarnation of something possibly similar, so no existing functionality was changed here.	2020-11-05 19:27:09 +01:00
Willy Tarreau	0439e5eeb4	MINOR: pattern: add pat_ref_commit() to commit a previously inserted element This function will be used after a successful pat_ref_append() to propagate the pattern to all use places (including parsing and indexing). On failure, it will entirely roll back all insertions and free the pattern itself. It also preserves the generation number so that it is convenient for use in association with pat_ref_append(). pat_ref_add() was modified to rely on it instead of open-coding the insertion and roll-back.	2020-11-05 19:27:09 +01:00
Willy Tarreau	c93da6950e	MEDIUM: pattern: only match patterns that match the current generation Instead of matching any pattern found in the tree, only match those matching the current generation of entries. This will make sure that reloads are atomic, regardless of the time they take to complete, and that newly added data are not matched until the whole reference is committed. For consistency we proceed the same way on "show map" and "show acl". This will have no impact for now since generations are not used.	2020-11-05 19:27:09 +01:00
Willy Tarreau	29947745b5	MINOR: pattern: store a generation number in the reference patterns Right now it's not possible to perform a safe reload because we don't know what patterns were recently added or were already present. This patch adds a generation counter to the reference patterns so that it is possible to know what generation of the reference they were loaded with. A reference now has two generations, the current one, used for all additions, and the next one, allocated to those wishing to update the contents. The generation wraps at 2^32 so comparisons must be made relative to the current position. The idea will be that upon full reload, the caller will first get a new generation ID, will insert all new patterns using it, will then switch the current ID to the new one, and will delete all entries older than the current ID. This has the benefit of supporting chunked updates that remain consistent and that won't block the whole process for ages like pat_ref_reload() currently does.	2020-11-05 19:27:09 +01:00
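The remark in 29947745b5 that the generation "wraps at 2^32 so comparisons must be made relative to the current position" can be sketched as below. This is a minimal illustration under stated assumptions, not the code from the patch: `gen_is_older` is a hypothetical name, and the rule "older means at most 2^31 - 1 steps behind" is the usual serial-number convention, assumed here.

```c
#include <stdint.h>

/* Hypothetical helper, not HAProxy code: returns non-zero when generation
 * <gen> lies in the past relative to <cur>, taking 32-bit wrap-around into
 * account. The difference is computed modulo 2^32; a result in the upper
 * half of the range means <gen> is behind <cur>.
 */
int gen_is_older(uint32_t gen, uint32_t cur)
{
    return (uint32_t)(gen - cur) >= 0x80000000u;
}
```

A plain `gen < cur` would give the wrong answer right after the counter wraps: a generation such as 0xFFFFFFF0 must still count as older than generation 5, which the relative comparison handles.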
Willy Tarreau	1fd52f70e5	MINOR: pattern: introduce pat_ref_delete_by_ptr() to delete a valid reference Till now the only way to remove a known reference was via pat_ref_delete_by_id() which scans the whole list to find a matching pointer. Let's add pat_ref_delete_by_ptr() which takes a valid pointer. It can be called by the function above after the pointer is found, and can also be used to roll back a failed insertion much more efficiently.	2020-11-05 19:27:09 +01:00
Willy Tarreau	a98b2882ac	CLEANUP: pattern: remove pat_delete_fcts[] and pattern_head->delete() These ones are not used anymore, so let's remove them to remove a bit of the complexity. The ACL keyword's delete() function could be removed as well, though most keyword declarations are positional and we have a high risk of introducing a mistake here, so let's not touch the ACL part.	2020-11-05 19:27:09 +01:00
Willy Tarreau	b35aa9b256	CLEANUP: acl: don't reference the generic pattern deletion function anymore A few ACL keyword used to reference pat_delete_gen() as the deletion function but this is not needed since it's the default one now. Let's just remove this reference.	2020-11-05 19:27:09 +01:00
Willy Tarreau	e828d8f0e8	MINOR: pattern: perform a single call to pat_delete_gen() under the expression When we're removing an element under the expression lock, we don't need anymore to run over all ->delete() functions via the expressions, since we know that the single function does it fine now. Note that at this point, pattern->delete() is not used at all through out the code anymore.	2020-11-05 19:27:09 +01:00
Willy Tarreau	f1c0892aa6	MINOR: pattern: remerge the list and tree deletion functions pat_del_tree_gen() was already chained onto pat_del_list_gen() to deal with remaining cases, so let's complete the merge and have a generic pattern deletion function acting on the reference and taking care of reliably removing all elements.	2020-11-05 19:27:09 +01:00
Willy Tarreau	78777ead32	MEDIUM: pattern: change the pat_del_* functions to delete from the references This is the next step in speeding up entry removal. Now we don't scan the whole lists or trees for elements pointing to the target reference, instead we start from the reference and delete all linked patterns. This simplifies some delete functions since we don't need anymore to delete multiple times from an expression since all nodes appear after the reference element. We can now have one generic list and one generic tree deletion function. This required the replacement of pattern_delete() with an open-coded version since we now need to lock all expressions first before proceeding. This means there is a high risk of lock inversion here but given that the expressions are always scanned in the same order from the same head, this must not happen. Now deleting first entries is instantaneous, and it's still slow to delete the last ones when looking up their ID since it still requires to look them up by a full scan, but it's already way faster than previously. Typically removing the last 10 IP from a 20M entries ACL with a full-scan each took less than 2 seconds. It would be technically possible to make use of indexed entries to speed up most lookups for removal by value (e.g. IP addresses) but that's for later.	2020-11-05 19:27:09 +01:00
Willy Tarreau	4bdd0a13d6	MEDIUM: pattern: link all final elements from the reference There is a data model issue in the current pattern design that makes pattern deletion extremely expensive: there's no direct way from a reference to access all indexed occurrences. As such, the only way to remove all indexed entries corresponding to a reference update is to scan all expressions's lists and trees to find a link to the reference. While this was possibly OK when map removal was not common and most maps were small, this is not conceivable anymore with GeoIP maps containing 10M+ entries and del-map operations that are triggered from http-request rulesets. This patch introduces two list heads from the pattern reference, one for the objects linked by lists and one for those linked by tree node. Ideally a single list would be enough but the linked elements are too much unrelated to be distinguished at the moment, so we'll need two lists. However for the long term a single-linked list will suffice but for now it's not possible due to the way elements are removed from expressions. As such this patch adds 32 bytes of memory usage per reference plus 16 per indexed entry, but both will be cut in half later. The links are not yet used for deletion, this patch only ensures the list is always consistent.	2020-11-05 19:27:09 +01:00
Willy Tarreau	6d8a68914e	MINOR: pattern: make the delete and prune functions more generic Now we have a single prune() function to act on an expression, and one delete function for the lists and one for the trees. The presence of a pointer in the lists is enough to warrant a free, and we rely on the PAT_SF_REGFREE flag to decide whether to free using free() or regfree().	2020-11-05 19:27:09 +01:00
Willy Tarreau	9b5c8bbc89	MINOR: pattern: new sflag PAT_SF_REGFREE indicates regex_free() is needed Currently we have no way to know how to delete/prune a pattern in a generic way. A pattern doesn't contain its own type so we don't know what function to call. Tree nodes are roughly OK but not lists where regex are possible. Let's add one new bit for sflags at index time to indicate that regex_free() will be needed upon deletion. It's not used for now.	2020-11-05 19:27:08 +01:00
Willy Tarreau	d4164dcd4a	CLEANUP: pattern: delete the back refs at once during pat_ref_reload() It's pointless to delete a backref and relink it to the next entry since the next entry is going to do the exact same and so on until all of them are deleted. Let's simply delete backrefs on reload.	2020-11-05 19:27:08 +01:00
Willy Tarreau	3ee0de1b41	MINOR: pattern: move the update revision to the pat_ref, not the expression It's not possible to uniquely update a single expression without updating the pattern reference, I don't know why we've put the revision in the expression back then, given that it in fact provides an update for a full pattern. Let's move the revision into the reference's head instead.	2020-11-05 19:27:08 +01:00
Willy Tarreau	114d698fde	MEDIUM: pattern: call malloc_trim() on pat_ref_reload() This is one case where we may release large amounts of data at once. Tests show that without this, after 10 full reloads of an ACL containing 1M IP addresses, the memory usage grew and stabilized around 1.7 GB of RSS. With this change, it stays around 260 MB and is stable across reloads.	2020-11-05 19:27:08 +01:00
Willy Tarreau	88366c2926	MEDIUM: pools: call malloc_trim() from pool_gc() If available it definitely makes sense to call it since it's also called when stopping to reclaim the maximum possible memory.	2020-11-05 19:27:08 +01:00
Baptiste Assmann	e279ca6bbe	MINOR: sample: Add converters to parse MQTT messages This patch implements a couple of converters to validate and extract data from an MQTT (Message Queuing Telemetry Transport) message. The validation consists of a few checks as well as "packet size" validation. The extraction can get any field from the variable header and the payload. This is limited to CONNECT and CONNACK packet types only. All other messages are considered as invalid. It is not a problem for now because only the first packet on each side can be parsed (CONNECT for the client and CONNACK for the server). MQTT 3.1.1 and 5.0 are supported. Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>	2020-11-05 19:27:03 +01:00
Baptiste Assmann	e138dda1e0	MINOR: sample: Add converters to parse FIX messages This patch implements a couple of converters to validate and extract tag value from a FIX (Financial Information eXchange) message. The validation consists in a few checks such as mandatory fields and checksum computation. The extraction can get any tag value based on a tag string or tag id. This patch requires the istend() function. Thus it depends on "MINOR: ist: Add istend() function to return a pointer to the end of the string". Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>	2020-11-05 19:26:30 +01:00
Ilya Shipitsin	0aa8c29460	BUILD: ssl: use feature macros for detecting ec curves manipulation support Let us use SSL_CTX_set1_curves_list, defined by OpenSSL, as well as in openssl-compat when SSL_CTRL_SET_CURVES_LIST is present (BoringSSL), for feature detection instead of versions.	2020-11-05 15:08:41 +01:00
William Lallemand	99e0bb997f	MINOR: mworker/cli: the master CLI use its own applet Following the patch b4daee ("MINOR: sock: add a check against cross worker<->master socket activities"), this patch adds a dedicated applet for the master CLI. It ensures that the CLI connection can't be used with the master rights in the case of bugs.	2020-11-05 10:28:53 +01:00
Willy Tarreau	21b9ff59b2	BUG/MEDIUM: server: make it possible to kill last idle connections In issue #933, @jaroslawr provided a report indicating that when using many threads and many servers, it's very difficult to terminate the last idle connections on each server. The issue has two causes in fact. The first one is that during the calculation of the estimate of needed connections, we round the computation up while in previous round it was already rounded up, so we end up adding 1 to 1 which once divided by 2 remains 1. The second issue is that servers are not woken up anymore for purging their connections if they don't have activity. The only reason that was there to wake them up again was in case insufficient connections were purged. And even then the purge task itself was not woken up. But that is not enough for getting rid of the long tail of old connections nor updating est_need_conns. This patch makes sure to properly wake up as long as at least one idle connection remains, and not to round up the needed connections anymore. Prior to this patch, a test involving many connections which suddenly stopped would keep many idle connections, now they're effectively halved every pool-purge-delay. This needs to be backported to 2.2.	2020-11-05 09:12:20 +01:00
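The rounding problem described in 21b9ff59b2 ("adding 1 to 1 which once divided by 2 remains 1") is easy to check with a little arithmetic. The sketch below is an assumption-laden illustration, not the server code: `settle_estimate` is a hypothetical name, and the estimate is modelled as the average of the previous estimate and a current usage of zero, repeated over several idle rounds.

```c
/* Hypothetical model, not HAProxy code: applies <rounds> idle rounds to an
 * estimate of needed connections. Each round averages the previous estimate
 * with a current usage of 0, rounding up when <round_up> is non-zero and
 * down otherwise. Returns the estimate after the last round.
 */
unsigned int settle_estimate(unsigned int est, unsigned int rounds, int round_up)
{
    unsigned int i;

    for (i = 0; i < rounds; i++)
        est = (est + (round_up ? 1u : 0u)) / 2;
    return est;
}
```

Starting from 8, rounding up yields 4, 2, 1, 1, 1 and so on, so the last idle connection is never considered surplus; rounding down yields 4, 2, 1, 0.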
Willy Tarreau	b4daeeb094	MINOR: sock: add a check against cross worker<->master socket activities Given that the previous issues caused spurious worker socket wakeups in the master for inherited FDs that couldn't be closed, let's add a strict test in the I/O callback to make sure that an accept() event is always caught by the appropriate type of process (master for master listeners, worker for worker listeners).	2020-11-04 15:05:50 +01:00
Christopher Faulet	fafd1b0a5b	CLEANUP: mux-h2: Remove the h1 parser state from the h2 stream Since the h2 multiplexer no longer relies on the legacy HTTP representation, and uses exclusively the HTX, the H1 parser state (h1m) is no longer used by the h2 streams. Thus it can be removed. This patch may be backported as far as 2.1.	2020-11-04 15:02:24 +01:00
Willy Tarreau	a4380b211f	MEDIUM: listeners: make use of fd_want_recv_safe() to enable early receivers We used to refrain from calling fd_want_recv() if fd_updt was not allocated but it's not the right solution as this does not allow the FD to be set. Instead, let's use the new fd_want_recv_safe() which will update the FD and create an update entry only if possible. In addition, the equivalent test before calling fd_stop_recv() was removed as totally useless since there's not fd_updt creation in this case.	2020-11-04 14:22:42 +01:00
Willy Tarreau	22ccd5ebaf	BUG/MEDIUM: listener: make the master also keep workers' inherited FDs In commit `374e9af35` ("MEDIUM: listener: let do_unbind_listener() decide whether to close or not") it didn't appear necessary to have the master process keep open the workers' inherited FDs. But this is actually necessary to handle the reload on "bind fd@foo" situations, otherwise the FD may be reassigned and the new socket cannot be set up, sometimes causing "socket operation on non-socket" or other types of errors. William found that this was the cause for the consistent failures of the abns regtest, which already used to fail very often before this and was as such marked as broken. Interestingly I didn't have this issue with my test configs because the FD number I used was higher and within the range of other listening sockets. But this means that one of these wouldn't work as expected. No backport is needed, this was introduced as part of the listeners rework in 2.3.	2020-11-04 14:22:42 +01:00
Willy Tarreau	59b5da4873	BUG/MEDIUM: listener: never suspend inherited sockets It is not acceptable to suspend an inherited socket because we'd kill its listening state, making it possibly unrecoverable for future processes. The situation which can trigger this is when there is an abns socket in a config and an inherited FD on another listener. Upon soft reload, the abns fails to bind, a SIGTTOU is sent to the old process which suspends everything, including the inherited FD, then the new process can bind and tell the old one to quit. Except that the new FD was not set back to the listen state, which is detected by listener_accept() which can pause it. It's only upon second reload that the FD works again. The solution is to refrain from suspending such FDs since we don't own them. And the next process will get them right anyway from its config. For now only TCP and UDP face this issue so it's better to address this on a protocol basis No backport is needed, this is related to the new listeners in 2.3.	2020-11-04 14:22:42 +01:00
Willy Tarreau	38dba27d4d	BUG/MEDIUM: listener: only enable a listening listener if needed The test on listener->state == LI_LISTEN is not sufficient to decide if we need to enable a listener. Indeed, there is a very special case which is the inherited FD shared, which has to reflect the real socket state even after the previous test, and as such needs to remain in LI_LISTEN state. In this case we don't want a worker to start the master's listener nor conversely. Let's add a specific test for this.	2020-11-04 14:22:42 +01:00
Willy Tarreau	dfe79251da	BUG/MEDIUM: stick-table: limit the time spent purging old entries An interesting case was reported with threads and moderately sized stick-tables. Sometimes the watchdog would trigger during the purge. It turns out that the stick tables were sized in the 10s of K entries which is the order of magnitude of the possible number of connections, and that threads were used over distinct NUMA nodes. While at first glance nothing looks problematic there, actually there is a risk that a thread trying to purge the table faces 100% of entries still in use by a connection with (ts->ref_cnt > 0), and ends up scanning the whole table, while other threads on the other NUMA node are causing the cache lines to bounce back and forth and considerably slow down its progress to the point of possibly spending hundreds of milliseconds there, multiplied by the number of queued threads all failing on the same point. Interestingly, smaller tables would not trigger it because the scan would be faster, and larger ones would not trigger it because plenty of entries would be idle! The most efficient solution is to increase the table size to be large enough for this never to happen, but this is not reliable. We could have a parallel list of idle entries but that would significantly increase the storage and processing cost only to improve a few rare corner cases. This patch takes a more pragmatic approach, it considers that it will not visit more than twice the number of nodes to be deleted, which means that it accepts to fail up to 50% of the time. Given that very small batches are programmed each time (1/256 of the table size), this means the operation will finish quickly (128 times faster than now), and will reduce the inter-thread contention. If this needs to be reconsidered, it will probably mean that the batch size needs to be fixed differently. This needs to be backported to stable releases which extensively use threads, typically 2.0. 
Kudos to Nenad Merdanovic for figuring the root cause triggering this!	2020-11-03 18:02:42 +01:00
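The bounded scan described in dfe79251da (visit at most twice the number of entries to be deleted, and accept that a batch may fall short) can be sketched as follows. This is a simplified model on a flat array under stated assumptions, not the stick-table code: `struct entry`, `purge_batch` and the `live` flag are hypothetical, and only the `ref_cnt > 0` test mirrors the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry: <live> marks a slot that still holds data, <ref_cnt>
 * counts users that prevent it from being purged.
 */
struct entry {
    int live;
    int ref_cnt;
};

/* Hypothetical helper, not HAProxy code: tries to purge up to <to_batch>
 * entries from <tab>, but gives up once 2 * <to_batch> live entries have
 * been visited. Stores the number of visited entries in <visited_out> and
 * returns the number of entries actually purged.
 */
size_t purge_batch(struct entry *tab, size_t n, size_t to_batch, size_t *visited_out)
{
    size_t max_visit = to_batch * 2; /* hard cap on the work per call */
    size_t visited = 0, purged = 0, i;

    assert(tab != NULL && visited_out != NULL);

    for (i = 0; i < n && purged < to_batch && visited < max_visit; i++) {
        if (!tab[i].live)
            continue;            /* empty slot, costs nothing */
        visited++;
        if (tab[i].ref_cnt > 0)
            continue;            /* still in use, cannot be purged */
        tab[i].live = 0;
        purged++;
    }
    *visited_out = visited;
    return purged;
}
```

When every entry is still referenced, the call returns 0 after a bounded number of visits instead of walking the whole table, which is the trade-off the commit message accepts.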
Amaury Denoyelle	e6ee820c07	MINOR: stats: do not display empty stat module title on html If a stat module is not available on the current proxy scope, do not display its title on the related html box. This is clearer for the user.	2020-11-03 17:04:22 +01:00
Amaury Denoyelle	e7b891f7d3	MINOR: mux_h2: add stat for total count of connections/streams Add counters for total number of http2 connections/stream since haproxy startup. Contrary to open_conn/stream, they are never reset to zero.	2020-11-03 17:04:22 +01:00
Amaury Denoyelle	2ac34d97a6	MINOR: mux_h2: capitalize frame type in stats http/2 frame type names are capitalized in the rfc, use the same notation on the stats labels.	2020-11-03 17:04:22 +01:00
Christopher Faulet	743bd6adc8	BUG/MINOR: filters: Skip disabled proxies during startup only This partially reverts the patch `400829cd2` ("BUG/MEDIUM: filters: Don't try to init filters for disabled proxies"). Disabled proxies must not be skipped in flt_deinit() and flt_deinit_all_per_thread() when HAProxy is stopped because, obviously, at this step, all proxies appear as disabled (or stopped, it is the same state). It is safe to do so because, during startup, filters declared on disabled proxies are removed. Thus they don't exist anymore during shutdown. This patch must be backported in all versions where the patch above is.	2020-11-03 16:51:48 +01:00
Ilya Shipitsin	04a5a440b8	BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions let us use HAVE_OPENSSL_KEYLOG for feature detection instead of versions	2020-11-03 14:54:15 +01:00
Christopher Faulet	5a7ca29061	BUG/MEDIUM: mux-pt: Release the tasklet during an HTTP upgrade When a TCP connection is upgraded to HTTP, the passthrough multiplexer owning the client connection is destroyed and replaced by an HTTP multiplexer. When it happens, the connection context is changed (it is in fact the mux itself). Thus, when the mux-pt is destroyed, the connection is not released. But, only the connection must be kept. Everything else concerning the mux must be released. Especially, the tasklet used for I/O subscriptions. In this part, there was a bug and the tasklet was never released. This patch should fix the issue #935. It must be backported as far as 2.0.	2020-11-03 10:50:00 +01:00
Christopher Faulet	75bef00538	MINOR: server: Copy configuration file and line for server templates When servers based on server templates are initialized, the configuration file and line are now copied. This helps to emit understandable warning and alert messages. This patch may be backported if needed, as far as 1.8.	2020-11-03 10:44:38 +01:00
Christopher Faulet	ac1c60fd9c	BUG/MINOR: server: Set server without addr but with dns in RMAINT on startup On startup, if a server has no address but the dns resolutions are configured, "none" method is added to the default init-addr methods, in addition to "last" and "libc". Thus on startup, this server is set to RMAINT mode if no address is found. It is only performed if no other init-addr method is configured. Setting the RMAINT mode on startup is important to inhibit the health checks. For instance, the following servers will now be set to RMAINT mode on startup : server srv nofound.tld:80 check resolvers mydns server srv _http._tcp.service.local check resolvers mydns server-template srv 1-3 _http._tcp.service.local check resolvers mydns while the following ones will trigger an error : server srv nofound.tld:80 check server srv nofound.tld:80 check resolvers mydns init-addr libc server srv _http._tcp.service.local check server srv _http._tcp.service.local check resolvers mydns init-addr libc server-template srv 1-3 _http._tcp.service.local check resolvers mydns init-addr libc This patch must be backported as far as 1.8.	2020-11-03 10:44:26 +01:00
Christopher Faulet	5e29376efb	BUG/MINOR: checks: Report a socket error before any connection attempt When a health-check fails, if no connection attempt was performed, a socket error must be reported. But this was only done if the connection was not allocated. It must also be done if there is no control layer. Otherwise, a L7TOUT will be reported instead. It is possible to not have a control layer for a connection if the connection address family is invalid or not defined. This patch must be backported to 2.2.	2020-11-03 10:23:00 +01:00
Christopher Faulet	d5bd824b81	BUG/MINOR: proxy/server: Skip per-proxy/server post-check for disabled proxies per-proxy and per-server post-check callback functions must be skipped for disabled proxies because most of the configuration validity check is skipped for these proxies. This patch must be backported as far as 2.1.	2020-11-03 10:23:00 +01:00
Christopher Faulet	400829cd2c	BUG/MEDIUM: filters: Don't try to init filters for disabled proxies Configuration is parsed for such proxies but not validated. Concretely, it means check_config_validity() function does almost nothing for such proxies. Thus, we must be careful to not initialize filters for disabled proxies because the check callback function is not called. In fact, to be sure to avoid any trouble, filters for disabled proxies are released. This patch fixes a segfault at startup if the SPOE is configured for a disabled proxy. It must be backported as far as 1.7 (maybe with some adaptations).	2020-11-03 10:23:00 +01:00
Ilya Shipitsin	c9dfee43f3	BUILD: ssl: use SSL_CTRL_GET_RAW_CIPHERLIST instead of OpenSSL versions let us use SSL_CTRL_GET_RAW_CIPHERLIST for feature detection instead of versions [wla: SSL_CTRL_GET_RAW_CIPHERLIST was introduced by OpenSSL commit 94a209 along with SSL_CIPHER_find. It was removed in boringSSL.] Signed-off-by: William Lallemand <wlallemand@haproxy.org>	2020-11-03 09:24:43 +01:00
Willy Tarreau	a5bbaaf9f4	CLEANUP: pattern: fix spelling/grammatical/copy-paste in comments The code is horrible to work with because most functions are documented with misleading comments resulting from many spelling and grammatical mistakes, and plenty of remains of copy-paste mentioning arguments that do not exist and return values that are never set. Too many hours wasted writing non-working code because of assumptions resulting from this, let's fix this once for all now!	2020-10-31 13:14:10 +01:00
Willy Tarreau	8135d9bc0c	CLEANUP: pattern: use calloc() rather than malloc for structures It's particularly difficult to make sure that the various pattern structures are properly initialized given that they can be allocated at multiple places and systematically via malloc() instead of calloc(), thus not even leaving the possibility of default values. Let's adjust a few of them.	2020-10-31 13:14:10 +01:00
Willy Tarreau	6bedf151e1	MINOR: pattern: export pat_ref_push() Strangely this one was marked static inline within the file itself. Let's export it.	2020-10-31 13:13:48 +01:00
Willy Tarreau	6a1740767c	MINOR: pattern: make pat_ref_add() rely on pat_ref_append() Let's remove unneeded code duplication, both are exactly the same.	2020-10-31 13:13:48 +01:00
Willy Tarreau	f4edb72e0a	MINOR: pattern: make pat_ref_append() return the newly added element It's more convenient to return the element than to return just 0 or 1, as the next thing we'll want to do is to act on this element! In addition it was using variable arguments instead of consts, causing some reuse constraints which were also addressed. This doesn't change its use as a boolean, hence why call places were not modified.	2020-10-31 13:13:48 +01:00
Remi Tricot-Le Breton	8c2db71326	BUG/MINOR: cache: Inverted variables in http_calc_maxage function The maxage and smaxage variables were inadvertently assigned the Cache-Control s-maxage and max-age values respectively when it should have been the other way around. This can be backported on all branches after 1.8 (included).	2020-10-30 14:29:29 +01:00
Remi Tricot-Le Breton	40ed97b04b	BUG/MINOR: cache: Manage multiple values in cache-control header value If an HTTP request or response had a "Cache-Control" header that had multiple comma-separated subparts in its value (like "max-age=1, no-store" for instance), we did not process the values correctly and only parsed the first one. That made us store some HTTP responses in the cache when they were explicitly uncacheable. This patch replaces the way the values are parsed by an http_find_header loop that manages every sub part of the value independently. This patch should be backported to 2.2 and 2.1. The bug also exists on previous versions but since the sources changed, a new commit will have to be created. [wla: This patch requires `bb4582c` ("MINOR: ist: Add a case insensitive istmatch function"). Backporting for < 2.1 is not a requirement since it works well enough for most cases, it was a known limitation of the implementation of non-htx version too]	2020-10-30 13:28:34 +01:00
Remi Tricot-Le Breton	a6476114ec	MINOR: cache: Add Expires header value parsing When no Cache-Control max-age or s-maxage information is present in a cached response, we need to parse the Expires header value (RFC 7234#5.3). An invalid Expires date value or a date earlier than the reception date will make the cache_entry stale upon creation. For now, the Cache-Control and Expires headers are parsed after the insertion of the response in the cache so even if the parsing of the Expires results in an already stale entry, the entry will exist in the cache.	2020-10-30 11:08:38 +01:00
Amaury Denoyelle	bc0af6a199	BUG/MINOR: lua: initialize sample before using it Memset the sample before using it through hlua_lua2smp. This function is ORing the smp.flags, so this field needs to be cleared before its use. This was reported by a coverity warning. Fixes the github issue #929. This bug can be backported up to 1.8.	2020-10-29 18:52:44 +01:00
Amaury Denoyelle	e6ba7915eb	BUG/MINOR: server: fix down_time report for stats Adjust condition used to report down_time for statistics. There was a tiny probability to have a negative downtime if last_change was superior to now. If this is the case, return only down_time. This bug can be backported up to 1.8.	2020-10-29 18:52:39 +01:00
Amaury Denoyelle	fe2bf091f6	BUG/MINOR: server: fix srv downtime calcul on starting When a server is up after a failure, its downtime was reset to 0 on the statistics. This is due to a wrong condition that causes srv.down_time to never be set. Fix this by updating down_time each time the server is in STARTING state. Fixes the github issue #920. This bug can be backported up to 1.8.	2020-10-29 18:52:18 +01:00
Amaury Denoyelle	66942c1d4d	MINOR: mux-h2: count open connections/streams on stats Implement as a gauge h2 counters for currently open connections and streams. The counters are decremented when closing the stream or the connection.	2020-10-28 08:55:23 +01:00
Amaury Denoyelle	a8879238ce	MINOR: mux-h2: report detected error on stats Implement counters for h2 protocol error on connection or stream level. Also count the total number of rst_stream and goaway frames sent by the mux in response to a detected error.	2020-10-28 08:55:19 +01:00
Amaury Denoyelle	2dec1ebec2	MINOR: mux-h2: add stats for received frame types Implement counters for h2 frame received based on their type for HEADERS, DATA, SETTINGS, RST_STREAM and GOAWAY.	2020-10-28 08:55:16 +01:00
Amaury Denoyelle	c92697d977	MINOR: mux-h2: add counters instance to h2c Add pointer to counters as a member for h2c structure. This pointer is initialized on h2_init function. This is useful to quickly access and manipulate the counters inside every h2 functions.	2020-10-28 08:55:11 +01:00
Amaury Denoyelle	3238b3f906	MINOR: mux-h2: register a stats module Use statistics API to register a new stats module generating counters on h2 module. The counters are attached to frontend/backend instances.	2020-10-28 08:55:07 +01:00
Remi Tricot-Le Breton	bf97121f1c	MINOR: cache: Create res.cache_hit and res.cache_name sample fetches Res.cache_hit sample fetch returns a boolean which is true when the HTTP response was built out of a cache. The cache's name is returned by the res.cache_name sample_fetch. This resolves GitHub issue #900.	2020-10-27 18:25:43 +01:00
Remi Tricot-Le Breton	53161d81b8	MINOR: cache: Process the If-Modified-Since header in conditional requests If a client sends a conditional request containing an If-Modified-Since header (and no If-None-Match header), we try to compare the date with the one stored in the cache entry (coming either from a Last-Modified header, or a Date header, or corresponding to the first response's reception time). If the request's date is earlier than the stored one, we send a "304 Not Modified" response back. Otherwise, the stored response is sent (through a 200 OK response). This resolves GitHub issue #821.	2020-10-27 18:10:25 +01:00
Remi Tricot Le Breton	27091b4dd0	MINOR: cache: Store the "Last-Modified" date in the cache_entry In order to manage "If-Modified-Since" requests, we need to keep a reference time for our cache entries (to which the conditional request's date will be compared). This reference is either extracted from the "Last-Modified" header, or the "Date" header, or the reception time of the response (in decreasing order of priority). The date values are converted into seconds since epoch in order to ease comparisons and to limit storage space.	2020-10-27 18:10:25 +01:00
Tim Duesterhus	e0142340b2	BUG/MINOR: cache: Check the return value of http_replace_res_status Send the full body if the status `304` cannot be applied. This should be the most graceful failure. Specific for 2.3, no backport needed.	2020-10-27 17:01:49 +01:00
Ilya Shipitsin	b9b84a4b25	BUILD: ssl: more elegant OpenSSL early data support check BoringSSL pretends to be 1.1.1 version of OpenSSL. It messes some version based feature presence checks. For example, OpenSSL specific early data support. Let us change that feature detection to SSL_READ_EARLY_DATA_SUCCESS macro check instead of version comparison.	2020-10-27 13:08:32 +01:00
Willy Tarreau	a0133fcf35	BUG/MINOR: log: fix risk of null deref on error path Previous commit `ae32ac74db` ("BUG/MINOR: log: fix memory leak on logsrv parse error") addressed one issue and introduced another one, the logsrv pointer may also be null at the end of the function so we must test it before deciding to dereference it. This should be backported along with the patch above to 2.2.	2020-10-27 10:35:32 +01:00
Willy Tarreau	ae32ac74db	BUG/MINOR: log: fix memory leak on logsrv parse error In case of parsing error on logsrv, we can leave parse_logsrv() without releasing logsrv->ring_name or smp_rgs. Let's free them on the error path. This should fix issue #926 detected by Coverity. The impact is only a tiny leak just before reporting a fatal error, so it will essentially annoy valgrind. This can be backported to 2.0 (just drop the ring part).	2020-10-27 09:55:00 +01:00
Emmanuel Hocdet	a73a222a98	BUG/MEDIUM: ssl: OCSP must work with BoringSSL It's a regression from `b3201a3e` "BUG/MINOR: disable dynamic OCSP load with BoringSSL". The original bug is linked to `76b4a12` "BUG/MEDIUM: ssl: memory leak of ocsp data at SSL_CTX_free()": ssl_sock_free_ocsp() should be in #ifndef OPENSSL_IS_BORINGSSL. To avoid long #ifdef for small code, the BoringSSL part for ocsp load is isolated in a simple #ifdef. This must be backported in 2.2 and 2.1	2020-10-27 09:38:51 +01:00
William Dauchy	5e10e44bce	CLEANUP: http_ana: remove unused assignation of `att_beg` `att_beg` is assigned to `next` at the end of the `for` loop, but is assigned to `prev` at the beginning of the loop, which is itself assigned to `next` after each loop. So it represents a double assignation for the same value. Also `att_beg` is not used after the end of the loop. this is a partial fix for github issue #923, all the others could probably be marked as intentional to protect future changes. no backport needed. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-10-26 15:00:09 +01:00
Willy Tarreau	b3250a268b	BUG/MINOR: extcheck: add missing checks on extchk_setenv() Issue #910 reports that we fail to check a few extchk_setenv() in the child process. These are mostly harmless, but instead of counting on the external check script to fail the dirty way, better fail cleanly when detecting the failure. This could probably be backported to all stable branches.	2020-10-24 13:07:39 +02:00
Willy Tarreau	5472aa50f1	BUG/MEDIUM: queue: fix unsafe proxy pointer when counting nbpend As reported by Coverity in issue #917, commit `96bca33` ("OPTIM: queue: decrement the nbpend and totpend counters outside of the lock") introduced a bug when moving the increments outside of the loop, because we can't always rely on the pendconn "p" here as it may be null. We can retrieve the proxy pointer directly from s->proxy instead. The same is true for pendconn_redistribute(), though the last "p" pointer there was still valid. This patch fixes both. No backport is needed, this was introduced just before 2.3-dev8.	2020-10-24 12:57:41 +02:00
Willy Tarreau	bd71510024	MINOR: stats: report server's user-configured weight next to effective weight The "weight" column on the stats page is somewhat confusing when using slowstart because it reports the effective weight, without being really explicit about it. In some situations the user-configured weight is more relevant (especially with long slowstarts where it's important to know if the configured weight is correct). This adds a new uweight stat which reports a server's user-configured weight, and in a backend it receives the sum of all servers' uweights. In addition it adds the mention of "effective" in a few descriptions for the "weight" column (help and doc). As a result, the list of servers in a backend is now always scanned when dumping the stats. But this is not a problem given that these servers are already scanned anyway and for way heavier processing.	2020-10-23 22:47:30 +02:00
William Lallemand	089c13850f	MEDIUM: ssl: ssl-load-extra-del-ext work only with .crt In order to be compatible with the "set ssl cert" command of the CLI, this patch restricts the ssl-load-extra-del-ext to files with a ".crt" extension in the configuration. Related to issue #785. Should be backported where `8e8581e` ("MINOR: ssl: 'ssl-load-extra-del-ext' removes the certificate extension") was backported.	2020-10-23 18:41:08 +02:00
Willy Tarreau	2fbe6940f4	MINOR: stats: indicate the number of servers in a backend's status When dumping the stats page (or the CSV output), when many states are mixed, it's hard to figure the number of up servers. But when showing only the "up" servers or hiding the "maint" servers, there's no way to know how many servers are configured, which is problematic when trying to update server-templates. What this patch does, for dumps in "up" or "no-maint" modes, is to add after the backend's "UP" or "DOWN" state "(%d/%d)" indicating the number of servers seen as UP to the total number of servers in the backend. As such, seeing "UP (33/39)" immediately tells that there are 6 servers that are not listed when using "up", or will let the client figure how many servers are left once deducted the number of non-maintenance ones. It's not done on default dumps so as not to disturb existing tools, which already have all the information they need in the dump.	2020-10-23 18:11:30 +02:00
Willy Tarreau	3e32036701	MINOR: stats: also support a "no-maint" show stat modifier "no-maint" is a bit similar to "up" except that it will only hide servers that are in maintenance (or disabled in the configuration), and not those that are enabled but failed a check. One benefit here is to significantly reduce the output of the "show stat" command when using large server-templates containing entries that are not yet provisioned. Note that the prometheus exporter also has such an option which does the exact same.	2020-10-23 18:11:24 +02:00
Willy Tarreau	65141ffc4f	MINOR: stats: support the "up" output modifier for "show stat" We already had it on the HTTP interface but it was not accessible on the CLI. It can be very convenient to hide servers which are down, do not resolve, or are in maintenance.	2020-10-23 18:11:24 +02:00
Willy Tarreau	8ae8c48eb0	MEDIUM: fwlc: re-enable per-server queuing up to maxqueue Leastconn has the nice property of being able to sort servers by their current usage. It's really a shame to force all requests into the backend queue when the algo would be able to also consider their current queue. In order not to change existing behavior but extend it, this patch allows leastconn to elect servers which are already full if they have an explicitly configured maxqueue setting above zero and their queue hasn't reached that threshold. This will significantly reduce the pressure in the backend queue when queuing a lot with lots of servers. A test on 8 threads with 100 servers configured with maxconn 1 jumped from 165krps to 330krps with maxqueue 15 with this patch. This partially undoes commit `82cd5c13a` ("OPTIM: backend: skip LB when we know the backend is full") but allows to scale much better even by setting a single-digit maxqueue value. Some better heuristics could be used to maintain the behavior of the bypass in the patch above, consisting in keeping it if it's known that there is no server with a configured maxqueue in the farm (or in the backend).	2020-10-22 18:30:25 +02:00
Willy Tarreau	8c855f6cff	MINOR: leastconn: take the queue length into account when queuing servers When servers are queued into the leastconn tree, it's important to also consider their queue length. There could be some servers with lots of queued requests that we don't want to hammer with extra connections. In order not to add extra stress to the LB algorithm, we don't update the value when adding to the queue, only when updating the connection count (i.e. picking from the queue or releasing a connection). This will be sufficient to significantly improve the fairness in such situations.	2020-10-22 18:30:18 +02:00
Willy Tarreau	96bca33d75	OPTIM: queue: decrement the nbpend and totpend counters outside of the lock We don't need to do that inside the lock. However since the operation used to be done in deep functions, we have to make it resurface closer to visible parts. It remains reasonably self-contained in queue.c so that's not that big of a deal. Some places (redistribute) could benefit from a single operation for all counts at once. Others like pendconn_process_next_strm() are still called with both locks held but now it will be possible to change this.	2020-10-22 17:32:28 +02:00
Willy Tarreau	56c1cfb179	OPTIM: queue: make the nbpend counters atomic Instead of incrementing, decrementing them and updating their max under the lock, make them atomic and keep them out of the lock as much as possible. For __pendconn_unlink_* it would be wise to decide to move these counters outside of the function, inside the callers so that a single atomic op can be done per counter even for groups of operations.	2020-10-22 17:32:28 +02:00
Willy Tarreau	c7eedf7a5a	MINOR: queue: reduce the locked area in pendconn_add() Similarly to previous changes, we know if we're dealing with a server or proxy lock so let's directly lock at the finest possible places there. It's worth noting that a part of the operation consisting in an increment and update of a max could be done outside of the lock using atomic ops and a CAS.	2020-10-22 17:32:28 +02:00
Willy Tarreau	3e3ae2524d	MINOR: queue: split __pendconn_unlink() in per-srv and per-prx The function is called with the lock held and does too many tests for things that are already known from its callers. Let's split it in two so that its callers call either the per-server or per-proxy function depending on where the element is (since they had to determine it prior to taking the lock).	2020-10-22 17:32:28 +02:00
Willy Tarreau	5503908bdc	MINOR: proxy/cli: only take a read lock in "show errors" There's no point having an exclusive lock here, nothing is modified.	2020-10-22 17:32:28 +02:00
Willy Tarreau	595e767030	MINOR: server: read-lock the cookie during srv_set_dyncookie() No need to use an exclusive lock on the proxy anymore when reading its setting, a read lock is enough. A few other places continue to use a write-lock when modifying simple flags only in order to let this function see a consistent value all along. This might be changed in the future using barriers and local copies.	2020-10-22 17:32:28 +02:00
Willy Tarreau	ac66d6bafb	MINOR: proxy; replace the spinlock with an rwlock This is an anticipation of finer grained locking for the queues. For now all lock places take a write lock so that there is no difference at all with previous code.	2020-10-22 17:32:28 +02:00
Christopher Faulet	9a3d3fcb5d	BUG/MAJOR: mux-h2: Don't try to send data if we know it is no longer possible In h2_send(), if we are in a state where we know it is no longer possible to send data, we must exit the sending loop to avoid any possibility to loop forever. It may happen if the mbuf ring is released while the H2_CF_MUX_MFULL flag is still set. Here is a possible scenario to trigger the bug : 1) The mbuf ring is full because we are unable to send data. The H2_CF_MUX_MFULL flag is set on the H2 connection. 2) At this stage, the task timeout expires because the H2 connection is blocked. We enter in h2_timeout_task() function. Because the mbuf ring is full, we cannot send the GOAWAY frame. Thus the H2_CF_GOAWAY_FAILED flag is set. The H2 connection is not released yet because there is still a stream attached. Here we leave h2_timeout_task() function. 3) A bit later, the H2 connection is woken up. In h2_process(), nothing is performed by the first attempt to send data, in h2_send(). Then, because the H2_CF_GOAWAY_FAILED flag is set, the mbuf ring is released. But the H2_CF_MUX_MFULL flag is still there. At this step a second attempt to send data is performed. 4) In h2_send(), we try to send data in a loop. To exit this loop, done variable must be set to 1. Because the H2_CF_MUX_MFULL flag is set, we don't call h2_process_mux() and done is not updated. Because the mbuf ring is now empty, nothing is sent and the H2_CF_MUX_MFULL flag is never removed. Now, we loop forever... waiting for the watchdog. To fix the bug, we now exit the loop if one of these conditions is true : - The H2_CF_GOAWAY_FAILED flag is set on the H2 connection - The CO_FL_SOCK_WR_SH flag is set on the underlying connection - The H2 connection is in the H2_CS_ERROR2 state This patch should fix the issue #912 and most probably #875. It must be backported as far as the 1.8.	2020-10-22 17:13:22 +02:00
Christopher Faulet	d6c48366b8	BUG/MINOR: http-ana: Don't send payload for internal responses to HEAD requests When an internal response is returned to a client, the message payload must be skipped if it is a reply to a HEAD request. The payload is removed from the HTX message just before the message forwarding. This bug has been around for a long time. It was already there in the pre-HTX versions. In legacy HTTP mode, internal errors are not parsed. So this bug cannot be easily fixed. Thus, this patch should only be backported in all HTX versions, as far as 2.0. However, the code has significantly changed in the 2.2. Thus in the 2.1 and 2.0, the patch must be entirely reworked.	2020-10-22 17:13:22 +02:00
Tim Duesterhus	6414cd1fc0	CLEANUP: compression: Make use of http_get_etag_type() This commit makes the compressor use http_get_etag_type to validate the ETag instead of using an ad-hoc condition.	2020-10-22 16:59:36 +02:00
Remi Tricot-Le Breton	6cb10384a3	MEDIUM: cache: Add support for 'If-None-Match' request header Partial support of conditional HTTP requests. This commit adds the support of the 'If-None-Match' header (see RFC 7232#3.2). When a client specifies a list of ETags through one or more 'If-None-Match' headers, they are all compared to the one that might have been stored in the corresponding http cache entry until one of them matches. If a match happens, a specific "304 Not Modified" response is sent instead of the cached data. This response has all the stored headers but no other data (see RFC 7232#4.1). Otherwise, the whole cached data is sent. Although unlikely in a GET/HEAD request, the "If-None-Match: *" syntax is valid and also receives a "304 Not Modified" response (RFC 7234#4.3.2). This resolves a part of GitHub issue #821.	2020-10-22 16:10:20 +02:00
Remi Tricot-Le Breton	dbb65b5a7a	MEDIUM: cache: Store the ETag information in the cache_entry When sent by a server for a given resource, the ETag header is stored in the corresponding cache entry (as any other header). So in order to perform future ETag comparisons (for subsequent conditional HTTP requests), we keep the length of the ETag and its offset relative to the start of the cache_entry. If no ETag header exists, the length and offset are zero.	2020-10-22 16:10:20 +02:00
Remi Tricot-Le Breton	bcced09b91	MINOR: http: Add etag comparison function Add a function that compares two etags that might be of different types. If any of them is weak, the 'W/' prefix is discarded and a strict string comparison is performed. Co-authored-by: Tim Duesterhus <tim@bastelstu.be>	2020-10-22 16:06:20 +02:00
Willy Tarreau	1e690bb6c4	BUG/MEDIUM: server: support changing the slowstart value from state-file If the slowstart value in a state file implies the latest state change is within the slowstart period, we end up calling srv_update_status() to reschedule the server's state change but its task is not yet allocated and remains null, causing a crash on startup. Make sure srv_update_status() supports being called with partially initialized servers which do not yet have a task. If the task has to be scheduled, it will necessarily happen after initialization since it will result from a state change. This should be backported wherever server-state is present.	2020-10-22 12:07:07 +02:00
Willy Tarreau	ef71f0194c	BUG/MINOR: queue: properly report redistributed connections In commit `5cd4bbd7a` ("BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management") the counter of transferred connections was accidentally lost, so that when a server goes down with connections in its queue, it will always be reported that 0 connections were transferred. This should be backported as far as 1.8 since the patch above was backported there.	2020-10-21 12:04:53 +02:00
William Lallemand	8e8581e242	MINOR: ssl: 'ssl-load-extra-del-ext' removes the certificate extension In issue #785, users are reporting that it's not convenient to load a ".crt.key" when the configuration contains a ".crt". This option allows to remove the extension of the certificate before trying to load any extra SSL file (.key, .ocsp, .sctl, .issuer etc.) The patch changes a little bit the way ssl_sock_load_files_into_ckch() looks for the file.	2020-10-20 18:25:46 +02:00
William Dauchy	835712ad90	BUG/MINOR: listener: close before free in `listener_accept` safer to close handle before the object is put back in the global pool. this was introduced by commit `9378bbe0be` ("MEDIUM: listener: use protocol->accept_conn() to accept a connection") this should fix github issue #902 no backport needed. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-10-20 15:40:36 +02:00
Willy Tarreau	f42d794d96	MEDIUM: config: report that "nbproc" is deprecated As previously discussed, nbproc usage is bad, deprecated, and scheduled for removal in 2.5. If "nbproc" is found with more than one process while nbthread is not set, a warning will be emitted encouraging to remove it or to migrate to nbthread instead. This makes sure the user has an opportunity to both see the message and silence it.	2020-10-20 11:54:49 +02:00
Willy Tarreau	69a7b8fc6c	CLEANUP: task: remove the unused and mishandled global_rqueue_size This counter is only updated and never used, and in addition it's done without any atomicity so it's very unlikely to be correct on multi-CPU systems! Let's just remove it since it's not used.	2020-10-19 14:08:13 +02:00
Willy Tarreau	3d18498645	CLEANUP: threads: don't register an initcall when not debugging It's a bit overkill to register an initcall to call a function to set a lock to zero when not debugging, let's just declare the lock as pre-initialized to zero.	2020-10-19 14:08:13 +02:00
Ilya Shipitsin	b3201a3e07	BUG/MINOR: disable dynamic OCSP load with BoringSSL it was accidentally enabled on BoringSSL while actually it is not supported wla: Fix part of the issue mentioned in #895. It fixes build of boringSSL versions prior to commit https://boringssl.googlesource.com/boringssl/+/49e9f67d8b7cbeb3953b5548ad1009d15947a523 Must be backported in 2.2. Signed-off-by: William Lallemand <wlallemand@haproxy.org>	2020-10-19 11:00:51 +02:00
Willy Tarreau	4b6e3c284a	MINOR: lb/chash: use a read lock in chash_get_server_hash() When using a low hash-balance-factor value, it's possible to loop many times trying to find the best server. Figures in the order of 100-300 times were observed for 1000 servers with a factor of 101 (which seems a bit excessive for such a large farm). Given that there's nothing in that function that prevents multiple threads from working in parallel, let's switch to a read lock. Tests on 8 threads show roughly a 2% performance increase with this.	2020-10-17 20:15:49 +02:00
Willy Tarreau	f76a21f78c	MINOR: lb/first: use a read lock in fas_get_next_server() The "first" algorithm creates a lot of contention because all threads focus on the same server by definition (the first available one). By turning the exclusive lock to a read lock in fas_get_next_server(), the request rate increases by 16% for 8 threads when many servers are getting close to their maxconn.	2020-10-17 19:49:49 +02:00
Willy Tarreau	58bc9c1ced	MINOR: lb/leastconn: only take a read lock in fwlc_get_next_server() This function doesn't change the tree, it only looks for the first usable server, so let's do that under a read lock to limit the situations like the ones described in issue #881 where finding a usable server when dealing with lots of saturated ones can be expensive. At least threads will now be able to look up in parallel. It's interesting to note that s->served is not incremented during the server choice, nor is the server repositioned. So right now already, nothing prevents multiple threads from picking the same server. This will not cause a significant imbalance anyway given that the server will automatically be repositioned at the right place, but this might be something to improve in the future if it doesn't come with too high a cost. It also looks like the way a server's weight is updated could be revisited so that the write lock gets tighter at the expense of a short part of inconsistency between weights and servers still present in the tree.	2020-10-17 19:37:40 +02:00
Willy Tarreau	ae99aeb135	MINOR: lb/map: use seek lock and read locks where appropriate - map_get_server_hash() doesn't need a write lock since it only reads the array, let's only use a read lock here. - map_get_server_rr() only needs exclusivity to adjust the rr_idx while looking for its entry. Since this one is not used by map_get_server_hash(), let's turn this lock to a seek lock that doesn't block reads. With 8 threads, no significant performance difference was noticed given that lookups are usually instant with this LB algo so the lock contention is rare.	2020-10-17 19:04:27 +02:00
Willy Tarreau	cd10def825	MINOR: backend: replace the lbprm lock with an rwlock It was previously a spinlock, and it happens that a number of LB algos only lock it for lookups, without performing any modification. Let's first turn it to an rwlock and w-lock it everywhere. This is strictly identical. It was carefully checked that every HA_SPIN_LOCK() was turned to HA_RWLOCK_WRLOCK() and that HA_SPIN_UNLOCK() was turned to HA_RWLOCK_WRUNLOCK() on this lock. _INIT and _DESTROY were updated too.	2020-10-17 18:51:41 +02:00
Christopher Faulet	26a52af642	BUG/MEDIUM: lb: Always lock the server when calling server_{take,drop}_conn The server lock must be held when the server_take_conn() and server_drop_conn() lbprm callback functions are called. It is a documented prerequisite but it is not always performed. It only affects the leastconn and fas lb algorithms. Others don't use these callback functions. A race condition on the next pending effective weight (next_eweight) may be encountered with the leastconn lb algorithm. An agent check may set it to 0 while fwlc_srv_reposition() is called. The server is locked during the next_eweight update. But because the server lock is not acquired when fwlc_srv_reposition() is called, we may use it to recompute the server key, leading to a division by 0. This patch must be backported as far as 1.8.	2020-10-17 09:29:43 +02:00
Christopher Faulet	db2c17da60	BUG/MEDIUM: mux-h1: Get the session from the H1S when capturing bad messages It is not guaranteed that the backend connection has an owner. It is set when the connection is created. But when the connection is moved to a server idle list, the connection owner is set to NULL and may never be set again. On the other hand, when a mux is created or when a CS is attached, the session is always defined. The H1 stream always keeps a reference to it when it is created. Thus, when a bad message is captured we should not rely on the connection owner to retrieve the session. Instead we should get it from the H1 stream.	2020-10-16 19:53:17 +02:00
Christopher Faulet	2469eba20f	BUG/MEDIUM: spoe: Unset variable instead of set it if no data provided If an agent tries to set a variable with the NULL data type, an unset is performed instead to avoid undefined behavior. Once decoded, such data are translated to a sample with the type SMP_T_ANY. It is unexpected in HAProxy. When a variable is set with such a sample, no data are attached to the variable. Thus, when the variable is retrieved later in the transaction, the sample data are uninitialized, leading to undefined behavior depending on how it is used. For instance, it leads to a crash if the debug converter is used on such a variable. This patch should fix the issue #855. It must be backported as far as 1.8.	2020-10-16 19:53:17 +02:00
Amaury Denoyelle	7239c24986	MEDIUM: backend: reuse connection if using a static sni Detect if the sni uses a constant value and if so, allow this connection to be reused for later sessions. Use a combination of SMP_USE_INTRN + !SMP_F_VOLATILE to consider a sample as a constant value. This feature was requested in GitHub issue #371.	2020-10-16 17:48:01 +02:00
Amaury Denoyelle	2f0a797631	MINOR: ssl: add volatile flags to ssl samples The ssl samples are not constant over time and change according to the session. Add the flag SMP_F_VOL_SESS to indicate this.	2020-10-16 17:47:29 +02:00
Frédéric Lécaille	baeb919177	BUG/MINOR: peers: Possible unexpected peer session reset after collisions. During a peers session collision (two peer sessions opened on both sides) we must mark as alive the peer whose session will be shut down; if not, the ->reconnect timer will be set with a wrong value if the synchro task expires after the peer has been reconnected. This possibly leads to unexpected disconnections during handshakes. Furthermore, this patch cancels any heartbeat transmission when a reconnection is prepared.	2020-10-16 17:45:58 +02:00
Willy Tarreau	0aa5a5b175	BUILD: listener: avoid a build warning when threads are disabled It's just a __decl_thread() that appeared before the last variable.	2020-10-16 17:43:04 +02:00
Willy Tarreau	d48ed6643b	MEDIUM: task: use an upgradable seek lock when scanning the wait queue Right now when running a configuration with many global timers (e.g. many health checks), there is a lot of contention on the global wait queue lock because all threads queue up in front of it to scan it. With 2000 servers checked every 10 milliseconds (200k checks per second), after 23 seconds running on 8 threads, the lock stats were this high: Stats about Lock TASK_WQ: write lock : 9872564 write unlock: 9872564 (0) wait time for write : 9208.409 msec wait time for write/lock: 932.727 nsec read lock : 240367 read unlock : 240367 (0) wait time for read : 149.025 msec wait time for read/lock : 619.991 nsec i.e. ~5% of the total runtime spent waiting on this specific lock. With upgradable locks we don't need to work like this anymore. We can just try to upgrade the read lock to a seek lock before scanning the queue, then upgrade the seek lock to a write lock for each element we want to delete there and immediately downgrade it to a seek lock. The benefit is double: - all other threads which need to call next_expired_task() before polling won't wait anymore since the seek lock is compatible with the read lock ; - all other threads competing on trying to grab this lock will fail on the upgrade attempt from read to seek, and will let the current lock owner finish collecting expired entries. Doing only this has reduced the wake_expired_tasks() CPU usage in a very large servers test from 2.15% to 1.04% as reported by perf top, and increased by 3% the health check rate (all threads being saturated). This is expected to help against (and possibly solve) the problem described in issue #875.	2020-10-16 17:15:54 +02:00
Willy Tarreau	3cfaa8d1e0	BUG/MEDIUM: task: bound the number of tasks picked from the wait queue at once There is a theoretical problem in the wait queue, which is that with many threads, one could spend a lot of time looping on the newly expired tasks, causing a lot of contention on the global wq_lock and on the global rq_lock. This initially sounds benign, but if another thread does just a task_schedule() or task_queue(), it might end up waiting for a long time on this lock, and this wait time will count on its execution budget, degrading the end user's experience and possibly risking to trigger the watchdog if that lasts too long. The simplest (and backportable) solution here consists in bounding the number of expired tasks that may be picked from the global wait queue at once by a thread, given that all other ones will do it as well anyway. We don't need to pick more than global.tune.runqueue_depth tasks at once as we won't process more, so this counter is updated for both the local and the global queues: threads with more local expired tasks will pick less global tasks and conversely, keeping the load balanced between all threads. This will guarantee a much lower latency if/when wakeup storms happen (e.g. hundreds of thousands of synchronized health checks). Note that some crashes have been witnessed with 1/4 of the threads in wake_expired_tasks() and, while the issue might or might not be related, not having reasonable bounds here definitely justifies why we can spend so much time there. This patch should be backported, probably as far as 2.0 (maybe with some adaptations).	2020-10-16 15:18:48 +02:00
Willy Tarreau	ba29687bc1	BUG/MEDIUM: proxy: properly stop backends The proxy stopping mechanism was changed with commit `322b9b94e` ("MEDIUM: proxy: make stop_proxy() now use stop_listener()") so that it's now entirely driven by the listeners. One thing was forgotten though, which is that pure backends will not stop anymore since they don't have any listener, and that it's necessary to stop them in order to stop the health checks. No backport is needed.	2020-10-16 15:16:17 +02:00
Willy Tarreau	233ad288cd	CLEANUP: protocol: remove the now unused <handler> field of proto_fam->bind() We don't need to specify the handler anymore since it's set in the receiver. Let's remove this argument from the function and clean up the remains of code that were still setting it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	a74cb38e7c	MINOR: protocol: register the receiver's I/O handler and not the protocol's Now we define a new sock_accept_iocb() for socket-based stream protocols and use it as a wrapper for listener_accept() which now takes a listener and not an FD anymore. This will allow the receiver's I/O cb to be redefined during registration, and more specifically to get rid of the hard-coded hacks in protocol_bind_all() made for syslog. The previous ->accept() callback in the protocol was removed since it doesn't have anything to do with accept() anymore but is more generic. A few places where listener_accept() was compared against the FD's IO callback for debugging purposes on the CLI were updated.	2020-10-15 21:47:56 +02:00
Willy Tarreau	e140a6921f	MINOR: log: set the UDP receiver's I/O handler in the receiver The I/O handler is syslog_fd_handler(), let's set it when creating the receivers.	2020-10-15 21:47:56 +02:00
Willy Tarreau	d2fb99f9d5	MINOR: protocol: add a default I/O callback and put it into the receiver For now we're still using the protocol's default accept() function as the I/O callback registered by the receiver into the poller. While this is usable for most TCP connections where a listener is needed, this is not suitable for UDP where a different handler is needed. Let's make this configurable in the receiver just like the upper layer is configurable for listeners. In order to ease stream protocols handling, the protocols will now provide a default I/O callback which will be preset into the receivers upon allocation so that almost none of them has to deal with it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	caa91de718	MEDIUM: listener: remove the second pass of fd manipulation at the end The receiver FDs must not be manipulated by the listener_accept() function anymore, it must exclusively rely on the job performed by its listeners, as it is also the only way to keep the receivers working for established connections regardless of the listener's state (typically for multiplexed protocols like QUIC). This used to be necessary when the FDs were adjusted at once only but now that fd_done() is gone and the need for polling enabled by the accept_conn() function which detects the EAGAIN, we have nothing to do there to fixup any possible previous bad decision anymore. Interestingly, as a side effect of making the code not depend on the FD anymore, it also removes the need for a second lock, which increases the accept rate by about 1% on 8 threads.	2020-10-15 21:47:56 +02:00
Willy Tarreau	9378bbe0be	MEDIUM: listener: use protocol->accept_conn() to accept a connection Now listener_accept() doesn't have to deal with the incoming FD anymore (except for a little bit of side band stuff). It directly retrieves a valid connection from the protocol layer, or receives a well-defined error code that helps it decide how to proceed. This removes a lot of hardly maintainable low-level code and opens the function to receive new protocol stacks.	2020-10-15 21:47:56 +02:00
Willy Tarreau	344b8fcf87	MINOR: sockpair: implement sockpair_accept_conn() to accept a connection This is the same as previous commit, but this time for the sockpair- specific stuff, relying on recv_fd_uxst() instead of accept(), so the code is simpler. The various errno cases are handled like for regular sockets, though some of them will probably never happen, but this does not hurt.	2020-10-15 21:47:56 +02:00
Willy Tarreau	f1dc9f2f17	MINOR: sock: implement sock_accept_conn() to accept a connection The socket-specific accept() code in listener_accept() has nothing to do there. Let's move it to sock.c where it can be significantly cleaned up. It will now directly return an accepted connection and provide a status code instead of letting listener_accept() deal with various errno values. Note that this doesn't support the sockpair specific code. The function is now responsible for dealing with its own receiver's polling state and calling fd_cant_recv() when facing EAGAIN. One tiny change from the previous implementation is that the connection's sockaddr is now allocated before trying accept(), which saves a memcpy() of the resulting address for each accept at the expense of a cheap pool_alloc/pool_free on the final accept returning EAGAIN. This still apparently slightly improves accept performance in microbenchmarks.	2020-10-15 21:47:56 +02:00
Willy Tarreau	7d053e4211	MINOR: sock: rename sock_accept_conn() to sock_accepting_conn() This call was introduced by commit `5ced3e887` ("MINOR: sock: add sock_accept_conn() to test a listening socket") but is actually quite confusing because it makes one think the socket will accept a connection (which is what we want to have in a new function) while it only tells whether it's configured to accept connections. Let's call it sock_accepting_conn() instead. The same change was applied to sockpair which had the same issue.	2020-10-15 21:47:56 +02:00
Willy Tarreau	01ca149047	MINOR: session: simplify error path in session_accept_fd() Now that this function is always called with an initialized connection and that the control layer is always initialized, we don't need to play games with fdtab[] to decide how to close, we can simply rely on the regular close path using conn_ctrl_close(), which can be fused with conn_xprt_close() into conn_full_close(). The code is cleaner because the FD is now used only for some protocol-specific setup (that will eventually have to move) and to try to send a hard-coded HTTP 500 error message on raw sockets.	2020-10-15 21:47:56 +02:00
Willy Tarreau	83efc320aa	MEDIUM: listener: allocate the connection before queuing a new connection Till now we would keep a per-thread queue of pending incoming connections for which we would store: - the listener - the accepted FD - the source address - the source address' length And these elements were first used in session_accept_fd() running on the target thread to allocate a connection and duplicate them again. Doing this induces various problems. The first one is that session_accept_fd() may only run on file descriptors and cannot be reused for QUIC. The second issue is that it induces lots of memory copies and that the listener queue thrashes a lot of cache, consuming 64 bytes per entry. This patch changes this by allocating the connection before queueing it, and by only placing the connection's pointer into the queue. Indeed, the first two calls used to initialize the connection already store all the information above, which can be retrieved from the connection pointer alone. So we just have to pop one pointer from the target thread, and pass it to session_accept_fd() which only needs the FD for the final settings. This starts to make the accept path a bit more transport-agnostic, and saves memory and CPU cycles at the same time (1% connection rate increase was noticed with 4 threads). Thanks to dividing the accept-queue entry size from 64 to 8 bytes, its size could be increased from 256 to 1024 connections while still dividing the overall size by two. No single queue full condition was met. One minor drawback is that a connection may be allocated from one thread's pool to be used in another one. But this already happens a lot with connection reuse so there is really nothing new here.	2020-10-15 21:47:56 +02:00
Willy Tarreau	9b7587a6af	MINOR: connection: make sockaddr_alloc() take the address to be copied Roughly half of the calls to sockaddr_alloc() are made to copy an already known address. Let's optionally pass it in argument so that the function can handle the copy at the same time, this slightly simplifies its usage.	2020-10-15 21:47:56 +02:00
Willy Tarreau	0138f51f93	CLEANUP: fd: finally get rid of fd_done_recv() fd_done_recv() used to be useful with the FD cache because it used to allow to keep a file descriptor active in the poller without being marked as ready in the cache, saving it from ringing immediately, without incurring any system call. It was a way to make it yield to wait for new events leaving a bit of time for others. The only user left was the connection accepter (listener_accept()). We used to suspect that with the FD cache removal it had become totally useless since changing its readiness or not wouldn't change its status regarding the poller itself, which would be the only one deciding to report it again. Careful tests showed that it indeed has exactly zero effect nowadays, the syscall numbers are exactly the same with and without, including when enabling edge-triggered polling. Given that there's no more API available to manipulate it and that it was directly called as an optimization from listener_accept(), it's about time to remove it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	e53e7ec9d9	CLEANUP: protocol: remove the ->drain() function No protocol defines it anymore. The last user used to be the monitor-net stuff that got partially broken already when the tcp_drain() function moved to conn_sock_drain() with commit `e215bba95` ("MINOR: connection: make conn_sock_drain() work for all socket families") in 1.9-dev2. A part of this will surely move back later when non-socket connections arrive with QUIC but better keep the API clean and implement what's needed in time instead.	2020-10-15 21:47:04 +02:00
Willy Tarreau	9e9919dd8b	MEDIUM: proxy: remove obsolete "monitor-net" As discussed here during 2.1-dev, "monitor-net" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200 if ..." and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
Willy Tarreau	77e0daef9f	MEDIUM: proxy: remove obsolete "mode health" As discussed here during 2.1-dev, "mode health" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, doesn't support source filtering, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200" and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
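
The replacement that the two commit messages above suggest for "monitor-net" and "mode health" could look roughly like the fragment below. This is a minimal sketch, not a vetted migration: the frontend name `fe_monitor`, the port and the source network are hypothetical placeholders, and only `mode http` and `http-request return status 200` come from the commit messages themselves.

```
# Hypothetical frontend standing in for a former "mode health" / "monitor-net" setup
frontend fe_monitor
    mode http
    bind :8080                       # placeholder port
    # Answer 200 directly to monitoring sources (placeholder network),
    # as suggested in place of "monitor-net"
    http-request return status 200 if { src 192.0.2.0/24 }
```

Dropping the `if { ... }` condition gives the unconditional form the "mode health" message mentions.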


10520 Commits