haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-11 01:26:58 +02:00

Author	SHA1	Message	Date
Thierry Fournier	59f11be436	MEDIUM: lua-thread: Add the lua-load-per-thread directive The goal is to allow execution of one main lua state per thread. This patch contains the main job. The lua init is done using these steps: - "lua-load-per-thread" loads the lua code in the first thread - it creates the structs - it stores loaded files - the 1st step load is completed (execution of hlua_post_init) and now, we know the number of threads - we initialize lua states for all remaining threads - for each one, we load the lua file - for each one, we execute post-init Once all is loaded, we check the consistency of function references. The rules are: - a function reference cannot be in the shared lua state and in a per-thread lua state at the same time. - if a function reference is declared in a per-thread lua state, it must be declared in all per-thread lua states	2020-12-02 21:53:16 +01:00
Thierry Fournier	c749259dff	MINOR: lua-thread: Store each function reference and init reference in array The goal is to allow execution of one main lua state per thread. The array introduces storage of one reference per thread, because each lua state can have different reference id for a same function. A function returns the preferred state id according to configuration and current thread id.	2020-12-02 21:53:16 +01:00
Thierry Fournier	021d986ecc	MINOR: lua-thread: Replace state_from by state_id The goal is to allow execution of one main lua state per thread. "state_from" is a pointer to the parent lua state. "state_id" is the index of the parent state in the reference lua states array. "state_id" is better because the lock decision becomes a "== 0" test, which is quicker than a pointer comparison. In addition, the state_id index could index other things than the Lua state concerned; I am thinking of the function references.	2020-12-02 21:53:16 +01:00
Thierry Fournier	62a22aa23f	MINOR: lua-thread: Replace "struct hlua_function" allocation by dedicated function The goal is to allow execution of one main lua state per thread. This function will initialize the struct with other things than 0. With this function helper, the initialization is centralized and it prevents mistakes. This patch also keeps a reference to each declared function in a list. It will be useful in next patches to control consistency of declared references.	2020-12-02 21:53:16 +01:00
Thierry Fournier	afc63e2cb1	MINOR: lua-thread: Replace global gL var with an array of states The goal is to allow execution of one main lua state per thread. The array of states is initialized at the max number of thread +1. We define the index 0 is the common state shared by all threads and should be locked. Other index index are dedicated to each one thread. The old gL now becomes hlua_states[0].	2020-12-02 21:53:16 +01:00
Thierry Fournier	7cbe5046e8	MEDIUM: lua-thread: Apply lock only if the parent state is the main thread The goal is to allow execution of one main lua state per thread. This patch opens the way to addition of a per-thread dedicated lua state. By passing the hlua we can figure the original state that's been used and decide to lock or not.	2020-12-02 21:53:16 +01:00
Thierry Fournier	3c539327f4	MEDIUM: lua-thread: No longer use locked context in initialization parts The goal is to allow execution of one main lua state per thread. Stop using locks in init part, we will use only in parts where the parent lua state is known, so we could take decision about lock according with the lua parent state.	2020-12-02 21:53:16 +01:00
Thierry Fournier	ecb83c24c4	MINOR: lua-thread: Add the "thread" core variable The goal is to allow execution of one main lua state per thread. This commit introduces this variable in the core. Lua state initialized by thread will have access to this variable, which reports the executing thread. 0 indicates the shared thread. Programs which must be executed only once can check for core.thread <= 1.	2020-12-02 21:53:16 +01:00
Thierry Fournier	b8cef175bd	MINOR: lua-thread: Split hlua_post_init() function in two parts The goal is to allow execution of one main lua state per thread. This function will be called for each initialized lua state, so one per thread. The split transforms the lua state variable from global to local.	2020-12-02 21:53:16 +01:00
Thierry Fournier	c93c15cf8c	MINOR: lua-thread: Split hlua_load function in two parts The goal is to allow execution of one main lua state per thread. This function will be called once per thread, using different Lua states. This patch prepares the work.	2020-12-02 21:53:16 +01:00
Thierry Fournier	75fc02956b	MINOR: lua-thread: make hlua_ctx_init() get L from its caller The goal is to allow execution of one main lua state per thread. The function hlua_ctx_init() now gets the original lua state from its caller. This allows the initialisation of lua_thread (coroutines) from any master lua state. The parent lua state is stored in the hlua struct. This patch is a temporary transition, it will be modified later.	2020-12-02 21:53:16 +01:00
Thierry Fournier	1eac28f5fc	MINOR: lua-thread: Split hlua_init() function in two parts The goal is to allow execution of one main lua state per thread. This is a preparative work in order to init more than one stack in the lua-thread objective.	2020-12-02 21:53:16 +01:00
Thierry Fournier	ad5345fed7	MINOR: lua-thread: Replace embedded struct hlua_function by a pointer The goal is to allow execution of one main lua state per thread. Because this struct will be filled after the configuration parser, we cannot copy the content. The actual state of the Haproxy code doesn't justify this change, it is an update preparing next steps.	2020-12-02 21:53:16 +01:00
Thierry Fournier	92689e651e	MINOR: lua-thread: Stop usage of struct hlua for the global lua state The goal is to no longer use "struct hlua" with global main lua_state. The usage of the "struct hlua" is no longer required. This patch replaces this struct by another one. Now, the usage of runtime Lua phase is separated from the start lua phase.	2020-12-02 21:53:16 +01:00
Thierry Fournier	4234dbd03b	MINOR: lua-thread: Use NULL context for main lua state The goal is to no longer use "struct hlua" with global main lua_state. This patch returns a NULL value when some code tries to get the hlua struct associated with a task through hlua_gethlua(). This function is useful only during runtime because the struct hlua contains only runtime states. Some Lua functions allowed to yield are called from the init environment. I'm not sure this is a good practice. It may be wise to disallow calling this kind of function.	2020-12-02 21:53:16 +01:00
Thierry Fournier	9eb3230b7c	MINOR: lua-thread: hlua_ctx_renew() is never called with main gL lua state The goal is no longer using "struct hlua" with global main lua_state. if somewhere in the code, hlua_ctx_renew() is called with a global Lua context, we have a serious bug. A crash is better than working with this bug, so this patch remove a useless control. In other way, this control were used during hlua_post_init() function. The function hlua_post_init() used a call to the runtime hlua_ctx_resume() function. This call no longer exists.	2020-12-02 21:53:16 +01:00
Thierry Fournier	670db24329	MEDIUM: lua-thread: make hlua_post_init() no longer use the runtime execution function The goal is to no longer use "struct hlua" with global main lua_state. The hlua_post_init() is executed during start phase, it does not require yielding nor any advanced runtime error processing. Let's simplify this by re-implementing the code using lower-level functions which directly take a state and not an hlua anymore.	2020-12-02 21:53:16 +01:00
Thierry Fournier	3fb9e5133a	MINOR: lua-thread: remove struct hlua from function hlua_prepend_path() The goal is to no longer use "struct hlua" with global main lua_state and directly take the state instead. This patch removes the implicit dependency to this struct with the function hlua_prepend_path()	2020-12-02 21:53:16 +01:00
Willy Tarreau	cdb53465f4	MEDIUM: lua-thread: use atomics for memory accounting Let's switch memory accounting to atomics so that the allocator function may safely be used from concurrent Lua states. Given that this function is extremely hot on the call path, we try to optimize it for the most common case, which is: - no limit - there's enough memory The accounting is what is particularly expensive in threads since all CPUs compete for a cache line, so when the limit is not used, we don't want to use accounting. However we need to preserve it during the boot phase until we may parse a "tune.lua.maxmem" value. For this, we turn the unlimited "0" value to ~0 at the end of the boot phase to mark the definite end of accounting. The function then detects this value and directly jumps to realloc() in this case. When the limit is enforced however, we use a CAS to check and reserve our share of memory, and we roll back on failure. The CAS is used both for increments and decrements so that a single operation is enough to update the counters.	2020-12-02 21:53:16 +01:00
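The check-and-reserve scheme described in this commit message can be sketched as follows. This is a minimal illustration, not HAProxy's code: `acct_realloc`, `mem_used`, `mem_limit` and `NO_LIMIT` are hypothetical names, the boot-phase switch from 0 to ~0 is left out, and the caller is assumed to pass `osize == 0` when `ptr` is NULL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define NO_LIMIT (~(size_t)0)          /* hypothetical marker: accounting disabled */

static _Atomic size_t mem_used;        /* bytes currently accounted for */
static size_t mem_limit = NO_LIMIT;    /* hypothetical stand-in for the configured limit */

/* realloc()-like allocator that also receives the old size for accounting. */
static void *acct_realloc(void *ptr, size_t osize, size_t nsize)
{
    assert(ptr != NULL || osize == 0);

    /* Fast path: no limit, so no shared counter is touched at all. */
    if (mem_limit == NO_LIMIT) {
        if (nsize == 0) { free(ptr); return NULL; }
        return realloc(ptr, nsize);
    }

    /* Reserve (or release) our share with a single CAS loop. */
    size_t old = atomic_load(&mem_used);
    size_t new_used;
    do {
        if (nsize > osize) {
            size_t grow = nsize - osize;
            if (old > mem_limit || grow > mem_limit - old)
                return NULL;           /* would exceed the limit; block untouched */
            new_used = old + grow;
        } else {
            new_used = old - (osize - nsize);
        }
    } while (!atomic_compare_exchange_weak(&mem_used, &old, new_used));

    if (nsize == 0) { free(ptr); return NULL; }

    void *ret = realloc(ptr, nsize);
    if (ret == NULL) {
        /* Roll back the reservation; unsigned wrap-around makes this
         * addition correct for both growth and shrink requests. */
        atomic_fetch_add(&mem_used, osize - nsize);
    }
    return ret;
}
```

Because growth and shrink both go through the same compare-and-swap, one atomic operation per call updates the counter, which matches the intent stated in the message.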
Willy Tarreau	d36c7fa5ec	MINOR: lua: simplify hlua_alloc() to only rely on realloc() The function really has the semantics of a realloc() except that it also passes the old size to help with accounting. No need to special case the free or malloc, realloc does everything we need.	2020-12-02 21:53:16 +01:00
Emeric Brun	fdabf49548	BUG/MAJOR: ring: tcp forward on ring can break the reader counter. If the session is not established, the applet handler could leave with the applet detached from the ring. At next call, the attach counter will be decreased again causing unpredictable behavior. This patch should be backported on branches >=2.2	2020-12-02 20:17:19 +01:00
Frédéric Lécaille	fd1831499e	BUG/MINOR: trace: Wrong displayed trace level With commit `a1f12746b` ("MINOR: traces: add a new level "error" below the "user" level") a new trace level was inserted, resulting in shifting all existing ones by one. But the levels reported in the __trace() function were not updated accordingly, resulting in the TRACE_LEVEL_DEVELOPER not to be properly reported anymore. This patch fixes it by extending the number of levels to 6. No backport is needed.	2020-12-02 17:44:40 +01:00
Remi Tricot-Le Breton	3243447f83	MINOR: cache: Add entry to the tree as soon as possible When many concurrent requests targeting the same resource were seen, the cache could sometimes be filled by too many partial responses resulting in the impossibility to cache a single one of them. This happened because the actual tree insertion happened only after all the payload of every response was seen. So until then, every response was added to the cache because none of the streams knew that a similar request/response was already being treated. This patch consists in adding the cache_entry as soon as possible in the tree (right after the first packet) so that the other responses do not get cached as well (if they have the same primary key). A "complete" flag is also added to the cache_entry so that we know if all the payload is already stored in the entry or if it is still being processed.	2020-12-02 16:38:42 +01:00
Remi Tricot-Le Breton	8bb72aa82f	MINOR: cache: Improve accept_encoding_normalizer Turn the "Accept-Encoding" value to lower case before processing it. Calculate the CRC on every token instead of a sorted concatenation of them all (in order to avoid copying them) then XOR all the CRCs into a single hash (while ignoring duplicates).	2020-12-02 16:32:54 +01:00
Thierry Fournier	f67442efdb	BUG/MINOR: lua: warn when registering action, conv, sf, cli or applet multiple times Lua allows registering multiple sample-fetches, converters, action, cli, applet/services with the same name. This is absolutely useless since only the first registration will be used. This patch sends a warning if the case is encountered. This patch could be backported until 1.8, with the 3 associated patches: - MINOR: actions: Export actions lookup functions - MINOR: actions: add a function returning a service pointer from its name - MINOR: cli: add a function to look up a CLI service description	2020-12-02 09:45:18 +01:00
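The order-independent hashing described here can be illustrated with a short sketch. It is not the HAProxy implementation: `normalize_list` and `MAX_TOKENS` are hypothetical names, FNV-1a stands in for the CRC, and token parameters such as `;q=0.8` are hashed as part of the token. Duplicates are detected by comparing per-token hashes, and the bounded `seen` array is checked before every store.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TOKENS 16   /* hypothetical bound on distinct tokens */

/* FNV-1a over the lower-cased bytes; a stand-in for the CRC in the text. */
static uint32_t fnv1a_lower(const char *s, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (uint32_t)tolower((unsigned char)s[i]);
        h *= 16777619u;
    }
    return h;
}

/* Hash a comma-separated list so that token order and duplicates do not
 * matter. Returns 0 on success, -1 if there are too many distinct tokens. */
static int normalize_list(const char *value, uint32_t *out)
{
    uint32_t seen[MAX_TOKENS];
    size_t nseen = 0;
    uint32_t acc = 0;
    const char *p = value;

    assert(value != NULL && out != NULL);

    while (*p) {
        /* skip separators and leading blanks */
        while (*p == ' ' || *p == '\t' || *p == ',')
            p++;
        const char *start = p;
        while (*p && *p != ',')
            p++;
        const char *end = p;
        /* trim trailing blanks */
        while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;
        if (end == start)
            continue;

        uint32_t h = fnv1a_lower(start, (size_t)(end - start));
        size_t i;
        for (i = 0; i < nseen; i++)
            if (seen[i] == h)
                break;
        if (i < nseen)
            continue;               /* duplicate token: ignore it */
        if (nseen == MAX_TOKENS)
            return -1;              /* never write past the array */
        seen[nseen++] = h;
        acc ^= h;                   /* XOR is commutative: order-independent */
    }
    *out = acc;
    return 0;
}
```

With this sketch, `"gzip, br"` and `"BR,gzip,gzip"` produce the same value, since XOR does not depend on token order and repeated tokens are skipped.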
Thierry Fournier	a51a1fd174	MINOR: cli: add a function to look up a CLI service description This function will be useful to check if the keyword is already registered. Also add a define for the max number of args. This will be needed by a next patch to fix a bug and will have to be backported.	2020-12-02 09:45:18 +01:00
Thierry Fournier	87e539906b	MINOR: actions: add a function returning a service pointer from its name This function simply calls action_lookup() on the private service_keywords, to look up a service name. This will be used to detect double registration of a same service from Lua. This will be needed by a next patch to fix a bug and will have to be backported.	2020-12-02 09:45:18 +01:00
Thierry Fournier	7a71a6d9d2	MINOR: actions: Export actions lookup functions These functions will be useful to check if a keyword is already registered. This will be needed by a next patch to fix a bug, and will need to be backported.	2020-12-02 09:45:18 +01:00
Thierry Fournier	2f05cc6f86	BUG/MINOR: lua: Some lua init operation are processed unsafe Operations luaL_openlibs() and lua_prepend path are processed without the safe context, so in case of failure Haproxy aborts or stops without error message. This patch could be backported until 1.8	2020-12-02 09:45:18 +01:00
Thierry Fournier	13d08b73eb	BUG/MINOR: lua: Post init register function are not executed beyond the first one Just because if the first init is a success we return success in place of continuing the loop. This patch could be backported until 1.8	2020-12-02 09:45:18 +01:00
Thierry Fournier	77a88943d6	BUG/MINOR: lua: lua-load doesn't check its parameters "lua-load" doesn't check if the expected parameter is present. It tries to open() directly the argument at second position. So if the filename is omitted, it tries to load an empty filename. This patch could be backported until 1.8	2020-12-02 09:42:43 +01:00
Thierry Fournier	de6145f747	BUG/MINOR: lua: missing "\n" in error message Just replace ".n" by "\n" This could be backported until 1.9, but it is not so important.	2020-12-02 09:31:33 +01:00
Willy Tarreau	f965b2ad13	BUG/MINOR: mux-h2/stats: not all GOAWAY frames are errors The stats on haproxy.org reported ~12k GOAWAY for ~34k connections, with only 2 protocol errors. It turns out that the GOAWAY frame counter added in commit `a8879238c` ("MINOR: mux-h2: report detected error on stats") matches a bit too many situations. First it counts those which are not sent as well as failed retries, second it counts as errors the cases of attempts to cleanly close, while it's titled "GOAWAY sent on detected error". Let's address this by moving the counter up one line and excluding the clean codes. This can be backported to 2.3.	2020-12-01 10:47:18 +01:00
Willy Tarreau	5dd36ac8a0	MINOR: mux-h2/trace: add traces at level ERROR for protocol errors A number of traces could be added, and a few TRACE_PROTO were replaced with TRACE_ERROR. The goal is to be able to enable error tracing only to detect anomalies. It looks like they're mostly correct as they don't seem to strike on valid H2 traffic but are very verbose on h2spec.	2020-12-01 10:30:37 +01:00
Willy Tarreau	a1f12746b1	MINOR: traces: add a new level "error" below the "user" level Sometimes it would be nice to be able to only trace abnormal events such as protocol errors. Let's add a new "error" level below the "user" level for this. This will allow to add TRACE_ERROR() at various error points and only see them.	2020-12-01 10:25:20 +01:00
Willy Tarreau	a307528fe2	BUG/MINOR: mux-h2/stats: make stream/connection proto errors more accurate Since commit `a8879238c` ("MINOR: mux-h2: report detected error on stats") we now have some error stats on stream/connection level protocol errors, but some were improperly marked as stream while they're connection, and 2 or 3 relevant ones were missing and have now been added. This could be backported to 2.3.	2020-12-01 10:25:20 +01:00
Maciej Zdeb	fcdfd857b3	MINOR: log: Logging HTTP path only with %HPO This patch adds a new logging variable '%HPO' for logging HTTP path only (without query string) from relative or absolute URI. For example: log-format "hpo=%HPO hp=%HP hu=%HU hq=%HQ" GET /r/1 HTTP/1.1 => hpo=/r/1 hp=/r/1 hu=/r/1 hq= GET /r/2?q=2 HTTP/1.1 => hpo=/r/2 hp=/r/2 hu=/r/2?q=2 hq=?q=2 GET http://host/r/3 HTTP/1.1 => hpo=/r/3 hp=http://host/r/3 hu=http://host/r/3 hq= GET http://host/r/4?q=4 HTTP/1.1 => hpo=/r/4 hp=http://host/r/4 hu=http://host/r/4?q=4 hq=?q=4	2020-12-01 09:32:44 +01:00
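The mapping shown in the examples above (drop the scheme and authority of an absolute URI, then drop the query string) can be sketched as below. This is only an illustration of the documented behaviour, not HAProxy's URI parser; `path_only` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the path inside <uri> and store its length in <len>.
 * Handles relative URIs ("/r/1?q=1") and absolute ones ("http://host/r/1"). */
static const char *path_only(const char *uri, size_t *len)
{
    const char *p = uri;

    assert(uri != NULL && len != NULL);

    if (*p != '/' && *p != '*') {
        /* absolute form: skip "scheme://" then the authority part */
        const char *sep = strstr(p, "://");
        if (sep != NULL) {
            p = sep + 3;
            p += strcspn(p, "/?");
        }
    }
    /* the path stops at the query string, if any */
    *len = strcspn(p, "?");
    return p;
}
```

For `http://host/r/4?q=4` the helper returns a pointer to `/r/4?q=4` with a length of 4, i.e. `/r/4`, which corresponds to the `hpo=` column in the commit message.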
Emeric Brun	0237c4e3f5	BUG/MEDIUM: local log format regression. Since 2.3 default local log format always adds hostname field. This behavior change was due to log/sink re-work, because according to rfc3164 the hostname field is mandatory. This patch re-introduces a legacy "local" format which is analog to rfc3164 but with hostname stripped. This is the new default if logs are generated by haproxy. To stay compliant with previous configurations, the option "log-send-hostname" acts as if the default format is switched to rfc3164. This patch addresses the github issue #963 This patch should be backported in branches >= 2.3.	2020-12-01 06:58:42 +01:00
Willy Tarreau	4d6c594998	BUG/MEDIUM: task: close a possible data race condition on a tasklet's list link In issue #958 Ashley Penney reported intermittent crashes on AWS's ARM nodes which would not happen on x86 nodes. After investigation it turned out that the Neoverse N1 CPU cores used in the Graviton2 CPU are much more aggressive than the usual Cortex A53/A72/A55 or any x86 regarding memory ordering. The issue that was triggered there is that if a tasklet_wakeup() call is made on a tasklet scheduled to run on a foreign thread and that tasklet is just being dequeued to be processed, there can be a race at two places: - if MT_LIST_TRY_ADDQ() happens between MT_LIST_BEHEAD() and LIST_SPLICE_END_DETACHED() if the tasklet is alone in the list, because the emptiness tests matches ; - if MT_LIST_TRY_ADDQ() happens during LIST_DEL_INIT() in run_tasks_from_lists(), then depending on how LIST_DEL_INIT() ends up being implemented, it may even corrupt the adjacent nodes while they're being reused for the in-tree storage. This issue was introduced in 2.2 when support for waking up remote tasklets was added. Initially the attachment of a tasklet to a list was enough to know its status and this used to be stable information. Now it's not sufficient to rely on this anymore, thus we need to use a different information. This patch solves this by adding a new task flag, TASK_IN_LIST, which is atomically set before attaching a tasklet to a list, and is only removed after the tasklet is detached from a list. It is checked by tasklet_wakeup_on() so that it may only be done while the tasklet is out of any list, and is cleared during the state switch when calling the tasklet. Note that the flag is not set for pure tasks as it's not needed. However this introduces a new special case: the function tasklet_remove_from_tasklet_list() needs to keep both states in sync and cannot check both the state and the attachment to a list at the same time. 
This function is already limited to being used by the thread owning the tasklet, so in this case the test remains reliable. However, just like its predecessors, this function is wrong by design and it should probably be replaced with a stricter one, a lazy one, or be totally removed (it's only used in checks to avoid calling a possibly scheduled event, and when freeing a tasklet). Regardless, for now the function exists so the flag is removed only if the deletion could be done, which covers all cases we're interested in regarding the insertion. This removal is safe against a concurrent tasklet_wakeup_on() since MT_LIST_DEL() guarantees the atomic test, and will ultimately clear the flag only if the task could be deleted, so the flag will always reflect the last state. This should be carefully backported as far as 2.2 after some observation period. This patch depends on previous patch "MINOR: task: remove __tasklet_remove_from_tasklet_list()".	2020-11-30 18:17:59 +01:00
Willy Tarreau	2da4c316c2	MINOR: task: remove __tasklet_remove_from_tasklet_list() This function is only used at a single place directly within the scheduler in run_tasks_from_lists() and it really ought not be called by anything else, regardless of what its comment says. Let's delete it, move the two lines directly into the call place, and take this opportunity to factor the atomic decrement on tasks_run_queue. A comment was added on the remaining one tasklet_remove_from_tasklet_list() to mention the risks in using it.	2020-11-30 18:17:44 +01:00
Willy Tarreau	c309dbdd99	MINOR: task: perform atomic counter increments only once per wakeup In process_runnable_tasks(), we walk the run queue and pick tasks to insert them into the local list. And for each of these operations we perform a few increments, some of which are atomic, and they're even performed under the runqueue's lock. This is useless inside the loop, better do them at the end, since we don't use these values inside the loop and they're not used anywhere else either during this time. The only one is task_list_size which is accessed in parallel by other threads performing remote tasklet wakeups, but it's already approximative and is used to decide to get out of the loop when the limit is reached. So now we compute it first as an initial budget instead.	2020-11-30 18:17:44 +01:00
Willy Tarreau	a868c2920b	MINOR: task: remove tasklet_insert_into_tasklet_list() This function is only called at a single place and adds more confusion than it removes. It also makes one think it could be used outside of the scheduler while it must absolutely not. Let's just move its two lines to the call place, making the code more readable there. In addition this clearly shows that the preliminary LIST_INIT() is useless since the entry is immediately overwritten.	2020-11-30 18:17:44 +01:00
Willy Tarreau	8a069eb9a4	MINOR: debug: add a trivial PRNG for scheduler stress-tests Commit `a5a447984` ("MINOR: debug: add "debug dev sched" to stress the scheduler.") doesn't scale with threads because ha_random64() takes care of being totally thread-safe for use with UUIDs. We don't need this for the stress-testing functions, let's just implement a xorshift PRNG instead. On 8 threads the performance jumped from 230k ctx/s with 96% spent in ha_random64() to 14M ctx/s.	2020-11-30 17:07:32 +01:00
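A xorshift generator of the kind mentioned here takes only a few lines. The sketch below uses Marsaglia's 32-bit variant with the (13, 17, 5) shift triple and a thread-local state; both choices are assumptions for illustration, since the commit message does not say which variant or state layout was used.

```c
#include <assert.h>
#include <stdint.h>

/* Per-thread state, so threads never contend on a shared cache line.
 * The seed is arbitrary but must be non-zero. */
static _Thread_local uint32_t rnd_state = 2463534242u;

/* Marsaglia xorshift32: fast, not cryptographically secure. */
static inline uint32_t xorshift32(void)
{
    uint32_t x = rnd_state;

    assert(x != 0);              /* a zero state would stay zero forever */
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rnd_state = x;
    return x;
}
```

This is adequate for picking random wake-up targets in a stress test, where statistical quality matters far less than avoiding synchronisation between threads.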
Willy Tarreau	a5a4479849	MINOR: debug: add "debug dev sched" to stress the scheduler. This command supports starting a bunch of tasks or tasklets, either on the current thread (mask=0), all (default), or any set, either single-threaded or multi-threaded, and possibly auto-scheduled. These tasks/tasklets will randomly pick another one to wake it up. The tasks only do it 50% of the time while tasklets always wake two tasks up, in order to achieve roughly 50% load (since the target might already be woken up).	2020-11-29 17:43:07 +01:00
Christopher Faulet	a9ffc41637	BUG/MINOR: http-fetch: Fix smp_fetch_body() when called from a health-check res.body may be called from a health-check. It is probably never used. But it is possible. In such case, there is no channel. Thus we must not use it unconditionally to set the flag SMP_F_MAY_CHANGE on the smp. Now the condition tests the channel first. In addition, the flag is not set if the payload is fully received. This patch must be backported as far as 2.2.	2020-11-27 10:30:23 +01:00
Christopher Faulet	83662b5431	MINOR: tcpcheck: Add support of L7OKC on expect rules error-status argument L7OKC may now be used as an error status for an HTTP/TCP expect rule. Thus it is for instance possible to write: option httpchk GET /isalive http-check expect status 200,404 http-check expect status 200 error-status L7OKC It is more or less the same than the disable-on-404 option except that if a DOWN is up again but still replying a 404 will be set to NOLB state. While it will stay in DOWN state with the disable-on-404 option.	2020-11-27 10:30:23 +01:00
Christopher Faulet	1e527cbf53	MINOR: check: Always increment check health counter on CONPASS Regarding the health counter, a check finished with the CONDPASS result is now handled the same as one with the PASSED result: the health counter is always incremented. Before, this was only done if the health counter was not 0. There is no change for the disable-on-404 option because it is only evaluated for running or stopping servers, so with a health check counter greater than 0. But it will make it possible to handle the (STOPPED -> STOPPING) transition for servers.	2020-11-27 10:30:23 +01:00
Christopher Faulet	97b7bdfcf7	REORG: tcpcheck: Move check option parsing functions based on tcp-check The parsing of the check options based on tcp-check rules (redis, spop, smtp, http...) are moved aways from check.c. Now, these functions are placed in tcpcheck.c. These functions are only related to the tcpcheck ruleset configured on a proxy and not to the health-check attached to a server.	2020-11-27 10:30:23 +01:00
Christopher Faulet	f8c869bac4	MINOR: config: Add a warning if tune.chksize is used This option is now deprecated. It is recent, but it is now marked as deprecated as far as 2.2. Thus, there is now a warning in the 2.4 if this option is still used. It will be removed in 2.5. Because the 2.3 is quite new, this patch may be backported to 2.3.	2020-11-27 10:30:23 +01:00
Christopher Faulet	bb9fb8b7f8	MINOR: config: Deprecate and ignore tune.chksize global option This option is now ignored because I/O check buffers are now allocated using the buffer pool. Thus, it is marked as deprecated in the documentation and ignored during the configuration parsing. The field is also removed from the global structure. Because this option is ignored since a recent fix, backported as far as 2.2, this patch should be backported too. Especially because it updates the documentation.	2020-11-27 10:30:23 +01:00
Christopher Faulet	b1bb069c15	MINOR: tcpcheck: Don't handle anymore in-progress connect rules in tcpcheck_main The special handling of in-progress connect rules at the beginning of tcpcheck_main() function can be removed. Instead, at the beginning of the tcpcheck_eval_connect() function, we test if there is already an existing connection. In this case, it means we are waiting for a connection establishment. In addition, before evaluating a new connect rule, we take care to release any previous connection.	2020-11-27 10:29:41 +01:00
Christopher Faulet	b381a505c1	BUG/MAJOR: tcpcheck: Allocate input and output buffers from the buffer pool Historically, the input and output buffers of a check are allocated by hand during the startup, with a specific size (not necessarily the same than other buffers). But since the recent refactoring of the checks to rely exclusively on the tcp-checks and to use the underlying mux layer, this part is totally buggy. Indeed, because these buffers are now passed to a mux, they may be swapped if a zero-copy is possible. In fact, for now it is only possible in h2_rcv_buf(). Thus the bug concretely only exists if a h2 health-check is performed. But, it is a latent bug for other muxes. Another problem is the size of these buffers. Because it may differ from the other buffer size, it might be a source of bugs. Finally, for configurations with hundreds of thousands of servers, having 2 buffers per check always allocated may be an issue. To fix the bug, we now allocate these buffers when required using the buffer pool. Thus not-running checks don't waste memory and muxes may swap them if possible. The only drawback is the check buffers have now always the same size than buffers used by the streams. This deprecates indirectly the "tune.chksize" global option. In addition, the http-check regtest has been updated to perform some h2 health-checks. Many thanks to @VigneshSP94 for its help on this bug. This patch should solve the issue #936. It relies on the commit "MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main". Both must be backported as far as 2.2. bla	2020-11-27 10:29:41 +01:00
Christopher Faulet	39066c2738	MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main The special handling of in-progress send rules at the beginning of tcpcheck_main() function can be removed. Instead, at the beginning of the tcpcheck_eval_send() function, we test if there is some data in the output buffer. In this case, it means we are evaluating an unfinished send rule and we can jump to the sending part, skipping the formatting part. This patch is mandatory for a major fix on the checks and must be backported as far as 2.2.	2020-11-27 10:08:21 +01:00
Christopher Faulet	1faf18ae39	BUG/MINOR: tcpcheck: Don't forget to reset tcp-check flags on new kind of check When a new kind of check is found during the parsing of a proxy section (via an option directive), we must reset tcpcheck flags for this proxy. It is mandatory to not inherit some flags from a previously declared check (for instance in the default section). This patch must be backported as far as 2.2.	2020-11-27 10:08:18 +01:00
Willy Tarreau	5a7d6ebf2c	MINOR: fd/threads: silence a build warning with threads disabled Building with gcc-9.3.0 without threads may result in this warning: In file included from include/haproxy/api-t.h:36, from include/haproxy/api.h:33, from src/fd.c:90: src/fd.c: In function 'updt_fd_polling': include/haproxy/fd.h:507:11: warning: array subscript 63 is above array bounds of 'int[1]' [-Warray-bounds] 507 \| DISGUISE(write(poller_wr_pipe[tid], &c, 1)); include/haproxy/compiler.h:92:41: note: in definition of macro 'DISGUISE' 92 \| #define DISGUISE(v) ({ typeof(v) __v = (v); ALREADY_CHECKED(__v); __v; }) \| ^ src/fd.c:113:5: note: while referencing 'poller_wr_pipe' 113 \| int poller_wr_pipe[MAX_THREADS]; // Pipe to wake the threads \| ^~~~~~~~~~~~~~ gcc is wrong but this time it cannot be blamed because it doesn't know that the FD's thread_mask always has at least one bit set. Let's add the test for all_threads_mask there. It will also remove that test and drop the else block.	2020-11-26 22:28:41 +01:00
Willy Tarreau	345ebcfc01	BUG/MAJOR: peers: fix partial message decoding Another bug in the peers message parser was uncovered by last commit `1dfd4f106` ("BUG/MEDIUM: peers: fix decoding of multi-byte length in stick-table messages"): the function return on incomplete message does not check if the channel has a pending close before deciding to return 0. It did not hurt previously because the loop calling co_getblk() once per character would have depleted the buffer and hit the end, causing <0 to be returned and matching the condition. But now that we process at once what is available this cannot be relied on anymore and it's now clearly visible that the final check is missing. What happens when this strikes is that if a peer connection breaks in the middle of a message, the function will return 0 (missing data) but the caller doesn't check for the closed buffer, subscribes to reads, and the applet handler is immediately called again since some data are still available. This is detected by the loop prevention and the process dies complaining that an appctx is spinning. This patch simply adds the check for closed channel. It must be backported to the same versions as the fix above.	2020-11-26 17:12:47 +01:00
Tim Duesterhus	23b2945c1c	BUG/CRITICAL: cache: Fix trivial crash by sending accept-encoding header Since commit `3d08236cb3` HAProxy can be trivially crashed remotely by sending an `accept-encoding` HTTP request header that contains 16 commas. This is because the `values` array in `accept_encoding_normalizer` accepts only 16 entries and it is not verified whether the end is reached during looping. Fix this issue by checking the length. This patch also simplifies the ist processing in the loop, because it manually calculated offsets and lengths, when the ist API exposes perfectly safe functions to advance and truncate ists. I wonder whether the accept_encoding_normalizer function is able to re-use some existing function for parsing headers that may contain lists of values. I'll leave this evaluation up to someone else, only patching the obvious crash. This commit is 2.4-dev specific and was merged just a few hours ago. No backport needed.	2020-11-25 10:23:00 +01:00
Remi Tricot-Le Breton	754b2428d3	MINOR: cache: Add a process-vary option that can enable/disable Vary processing The cache section's process-vary option takes a 0 or 1 value to disable or enable the vary processing. When disabled, a response containing such a header will never be cached. When enabled, we will calculate a preliminary hash for a subset of request headers on all the incoming requests (which might come with a cpu cost) which will be used to build a secondary key for a given request (see RFC 7234#4.1). The default value is 0 (disabled).	2020-11-24 16:52:57 +01:00
Remi Tricot-Le Breton	1785f3dd96	MEDIUM: cache: Add the Vary header support Calculate a preliminary secondary key for every request we see so that we can have a real secondary key if the response is cacheable and contains a manageable Vary header. The cache's ebtree is now allowed to have multiple entries with the same primary key. Two of those entries will be distinguished thanks to secondary keys stored in the cache_entry (based on hashes of a subset of their headers). When looking for an entry in the cache (cache_use), we still use the primary key (built the same way as before), but in case of match, we also need to check if the entry has a vary signature. If it has one, we need to perform an extra check based on the newly built secondary key. We will only be able to forge a response out of the cache if both the primary and secondary keys match with one of our entries. Otherwise the request will be forwarded to the server.	2020-11-24 16:52:57 +01:00
Remi Tricot-Le Breton	3d08236cb3	MINOR: cache: Prepare helper functions for Vary support The Vary functionality is based on a secondary key that needs to be calculated for every request to which a server answers with a Vary header. The Vary header, which can only be found in server responses, determines which headers of the request need to be taken into account in the secondary key. Since we do not want to have to store all the headers of the request until we have the response, we will pre-calculate as many sub-hashes as there are headers that we want to manage in a Vary context. We will only focus on a subset of headers which are likely to be mentioned in a Vary response (accept-encoding and referer for now). Every managed header will have its own normalization function which is in charge of transforming the header value into a core representation, more robust to insignificant changes that could exist between multiple clients. For instance, two accept-encoding values mentioning the same encodings but in different orders should give the same hash. This patch adds a function that parses a Vary header value and checks if all the values belong to our supported subset. It also adds the normalization functions for our two headers, as well as utility functions that can prebuild a secondary key for a given request and transform it into an actual secondary key after the vary signature is determined from the response.	2020-11-24 16:52:57 +01:00
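The normalization idea above (two accept-encoding values listing the same encodings in a different order should hash identically) can be sketched as follows. This is not the normalizer from the patch: it simply hashes each lower-cased token with FNV-1a and adds the results, relying on addition being commutative. The function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a lower-cased token. */
static uint32_t fnv1a_lower(const char *s, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= (uint32_t)tolower((unsigned char)s[i]);
        h *= 16777619u;
    }
    return h;
}

/* Returns a hash of a comma-separated list that does not depend on the
 * order of the tokens, their case, or the whitespace around them.
 */
uint32_t order_independent_hash(const char *value)
{
    uint32_t sum = 0;
    const char *p = value;

    assert(value != NULL);

    while (*p) {
        const char *start, *end;

        /* Skip separators and leading whitespace. */
        while (*p == ',' || *p == ' ' || *p == '\t')
            p++;

        start = p;
        while (*p && *p != ',')
            p++;

        /* Trim trailing whitespace from the token. */
        end = p;
        while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;

        if (end > start)
            sum += fnv1a_lower(start, (size_t)(end - start));
    }
    return sum;
}
```

Summing per-token hashes is the simplest commutative combination; sorting the per-token hashes before mixing them would be a sturdier alternative at the cost of bounded storage, which is exactly where a bounds check becomes necessary.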
Christopher Faulet	401e6dbff3	BUG/MAJOR: filters: Always keep all offsets up to date during data filtering When at least one data filter is registered on a channel, the offsets of all filters must be kept up to date. For data filters but also for others. It is safer to do it in that way. Indirectly, this patch fixes 2 hidden bugs revealed by the commit `22fca1f2c` ("BUG/MEDIUM: filters: Forward all filtered data at the end of http filtering"). The first one, the worst of both, happens at the end of http filtering when at least one data filter is registered on the channel. We call the http_end() callback function on the filters, when defined, to finish the http filtering. But it is performed for all filters. Before the commit `22fca1f2c`, the only risk was to call the http_end() callback function unexpectedly on a filter. Now, we may have an overflow on the offset variable, used at the end to forward all filtered data. Of course, from the moment we forward an arbitrary huge amount of data, all kinds of bad things may happen. So offset computation is performed for all filters and http_end() callback function is called only for data filters. The other one happens when a data filter alters the data of a channel: it must update the offsets of all previous filters. But the offset of non-data filters must be up to date, otherwise, here too we may have an integer overflow. Another way to fix these bugs is to always ignore non-data filters from the offsets computation. But this patch is safer and probably easier to maintain. This patch must be backported in all versions where the above commit is. So as far as 2.0.	2020-11-24 14:17:32 +01:00
Maciej Zdeb	6dee9969b9	BUG/MEDIUM: http_act: Restore init of log-format list Restore init of log-format list in parse_http_del_header which was accidentally deleted by commit `ebdd4c55da` (implementation of different header matching methods for http-request/response del-header). This is related to GitHub issue #909	2020-11-24 10:33:46 +01:00
Ilya Shipitsin	d9a16dc0f2	BUILD: SSL: add BoringSSL guarding to "RAND_keep_random_devices_open" "RAND_keep_random_devices_open" is OpenSSL specific and is not present in other OpenSSL variants like LibreSSL or BoringSSL. BoringSSL recently "updated" its internal openssl version to 1.1.1, we temporarily set it back to 1.1.0, as we are going to remove that hack, let us add proper guarding.	2020-11-24 09:54:44 +01:00
Julien Pivotto	2de240a676	MINOR: stream: Add level 7 retries on http error 401, 403 Level-7 retries are only possible with a restricted number of HTTP return codes. While it is usually not safe to retry on 401 and 403, I came up with an authentication backend which was not synchronizing authentication of users. While not perfect, being allowed to also retry on those return codes is really helpful and acts as a hotfix until we can fix the backend. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-11-23 09:33:14 +01:00
Tim Duesterhus	c8d19702f4	BUILD: Show the value of DEBUG= in haproxy -vv Previously this was not visible after building.	2020-11-21 18:27:33 +01:00
Maciej Zdeb	ebdd4c55da	MINOR: http_act: Add -m flag for del-header name matching method This patch adds -m flag which allows to specify header name matching method when deleting headers from http request/response. Currently beg, end, sub, str and reg are supported. This is related to GitHub issue #909	2020-11-21 15:54:30 +01:00
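The matching methods listed above (beg, end, sub, str) can be sketched as a single C helper. This is an illustration only: `enum match_method` and `header_name_matches()` are hypothetical names, the comparison is assumed to be case-insensitive because HTTP header names are, and the `reg` method is left out to keep the sketch self-contained.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for the str, beg, end and sub methods. */
enum match_method { MATCH_STR, MATCH_BEG, MATCH_END, MATCH_SUB };

/* Case-insensitive comparison of the first <n> bytes of <a> and <b>. */
static int equal_nocase(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Returns 1 if header name <name> matches <pat> using method <m>. */
int header_name_matches(const char *name, const char *pat, enum match_method m)
{
    size_t nl, pl;

    assert(name != NULL && pat != NULL);

    nl = strlen(name);
    pl = strlen(pat);
    if (pl > nl)
        return 0;

    switch (m) {
    case MATCH_STR: /* whole name */
        return nl == pl && equal_nocase(name, pat, pl);
    case MATCH_BEG: /* prefix */
        return equal_nocase(name, pat, pl);
    case MATCH_END: /* suffix */
        return equal_nocase(name + nl - pl, pat, pl);
    case MATCH_SUB: /* anywhere; the window is always exactly <pl> bytes */
        for (size_t i = 0; i + pl <= nl; i++)
            if (equal_nocase(name + i, pat, pl))
                return 1;
        return 0;
    }
    return 0;
}
```

Under these assumptions, deleting every header whose name starts with `x-` amounts to calling the helper with `MATCH_BEG` on each header name and removing the ones for which it returns 1.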
Maciej Zdeb	302b9f8d7a	BUG/MINOR: http_htx: Fix searching headers by substring Function __http_find_header is used to search headers by name using the specified matching method. Matching by substring returned unexpected results due to the wrong length of substring supplied to the strnistr function. Also fixed the boolean condition by inverting it, as we're interested in headers that contain the substring. This patch should be backported as far as 2.2	2020-11-21 15:54:26 +01:00
Willy Tarreau	3aab17bd56	BUG/MAJOR: connection: reset conn->owner when detaching from session list Baptiste reported a new crash affecting 2.3 which can be triggered when using H2 on the backend, with http-reuse always and with tens of clients doing close only. There are a few combined cases which cause this to happen, but each time the issue is the same, an already freed session is dereferenced in session_unown_conn(). Two cases were identified to cause this: - a connection referencing a session as its owner, which is detached from the session's list and is destroyed after this session ends. The test on conn->owner before calling session_unown_conn() is not sufficient as the pointer is not null but is not valid anymore. - a connection that never goes idle and that gets killed from the mux, where session_free() is called first, then conn_free() calls session_unown_conn() which scans the just freed session for older connections. This one is only triggered with DEBUG_UAF. The reason for this session to be present here is that it's needed during the connection setup, to be passed to conn_install_mux_be() to mux->init() as the owning session, but it's never deleted afterwards. Furthermore, even conn_session_free() doesn't delete this pointer after freeing the session that lies there. Both do definitely result in a use-after-free that's more easily triggered under DEBUG_UAF. This patch makes sure that the owner is always deleted after detaching or killing the session. However it is currently not possible to clear the owner right after a synchronous init because the proxy protocol apparently needs it (a reg test checks this), and if we leave it past the connection setup with the session not attached anywhere, it's hard to catch the right moment to detach it. This means that the session may remain in conn->owner as long as the connection has never been added to nor removed from the session's idle list. 
Given that this patch needs to remain simple enough to be backported, instead it adds a workaround in session_unown_conn() to detect that the element is already not attached anywhere. This fix absolutely requires previous patch "CLEANUP: connection: do not use conn->owner when the session is known" otherwise the situation will be even worse, as some places used to rely on conn->owner instead of the session. The fix could theoretically be backported as far as 1.8. However, the code in this area has significantly changed along versions and there are more risks of breaking working stuff than fixing real issues there. The issue was really woken up in two steps during 2.3-dev when slightly reworking the idle conns with commit `08016ab82` ("MEDIUM: connection: Add private connections synchronously in session server list") and when adding support for storing used H2 connections in the session and adding the necessary call to session_unown_conn() in the muxes. But the same test managed to crash 2.2 when built in DEBUG_UAF and patched like this, proving that we used to already leave dangling pointers behind us:
  | diff --git a/include/haproxy/connection.h b/include/haproxy/connection.h
  | index f8f235c1a..dd30b5f80 100644
  | --- a/include/haproxy/connection.h
  | +++ b/include/haproxy/connection.h
  | @@ -458,6 +458,10 @@ static inline void conn_free(struct connection *conn)
  |                         sess->idle_conns--;
  |                 session_unown_conn(sess, conn);
  |         }
  | +       else {
  | +               struct session *sess = conn->owner;
  | +               BUG_ON(sess && sess->origin != &conn->obj_type);
  | +       }
  |
  |         sockaddr_free(&conn->src);
  |         sockaddr_free(&conn->dst);
It's uncertain whether an existing code path there can lead to dereferencing conn->owner when it's bad, though certain suspicious memory corruption bugs make one think it's a likely candidate. The patch should not be hard to adapt there. Backports to 2.1 and older are left to the appreciation of the person doing the backport. 
A reproducer consists in this:
    global
        nbthread 1
    listen l
        bind :9000
        mode http
        http-reuse always
        server s 127.0.0.1:8999 proto h2
    frontend f
        bind :8999 proto h2
        mode http
        http-request return status 200
Then this will make it crash within 2-3 seconds:
    $ h1load -e -r 1 -c 10 http://0:9000/
If it does not, it might be that DEBUG_UAF was not used (it's harder then) and it might be useful to restart.	2020-11-21 15:29:22 +01:00
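The principle behind the fix (never leave conn->owner pointing at a session the connection is no longer attached to) can be sketched in isolation. The structures and `detach_from_session()` below are hypothetical, heavily reduced stand-ins, not the real `struct connection`, `struct session` or session_unown_conn().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the real structures. */
struct session {
    int idle_conns; /* connections currently attached to this session */
};

struct connection {
    struct session *owner; /* NULL when not attached to any session */
};

/* Detaches <conn> from <sess> and clears the back-pointer, so a later
 * "if (conn->owner)" test cannot succeed on a pointer to a session that
 * has been freed in the meantime.
 */
void detach_from_session(struct session *sess, struct connection *conn)
{
    assert(sess != NULL && conn != NULL);

    if (conn->owner != sess)
        return; /* already detached: nothing to undo */

    sess->idle_conns--;
    conn->owner = NULL;
}
```

Because the pointer is reset on the first call, a second detach for the same connection becomes a harmless no-op rather than a second decrement or a dereference of stale memory.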
Willy Tarreau	38b4d2eb22	CLEANUP: connection: do not use conn->owner when the session is known At a few places we used to rely on conn->owner to retrieve the session while the session is already known. This is not correct because at some of these points the reason the connection's owner was still the session (instead of NULL) is a mistake. At one place a comparison is even made between the session and conn->owner assuming it's valid without checking if it's NULL. Let's clean this up to use the session all the time. Note that this will be needed for a forthcoming fix and will have to be backported.	2020-11-21 15:29:22 +01:00
Ilya Shipitsin	f34ed0b74c	BUILD: SSL: guard TLS13 ciphersuites with HAVE_SSL_CTX_SET_CIPHERSUITES HAVE_SSL_CTX_SET_CIPHERSUITES is a newly defined macro set in openssl-compat.h, which helps to identify ssl libs (currently OpenSSL-1.1.1 only) that support TLS13 ciphersuites manipulation on TLS13 context	2020-11-21 11:04:36 +01:00
William Lallemand	77e1c6fb0a	BUG/MEDIUM: ssl/crt-list: fix error when no file found When a file from a crt-list was not found, it was silently ignored, letting HAProxy start without it. This bug was introduced by `47da821` ("MEDIUM: ssl: emulates the multi-cert bundles in the crtlist"). This commit adds a found variable which is checked once we tried every bundle combination so we can exit with an error if none were found. Must be backported in 2.3.	2020-11-20 18:38:56 +01:00
William Lallemand	7340457158	BUG/MINOR: ssl/crt-list: load bundle in crt-list only if activated Don't try to load a bundle from a crt-list if the bundle support was disabled with ssl-load-extra-files. Must be backported to 2.3.	2020-11-20 18:38:56 +01:00
William Lallemand	06ce84a100	BUG/MEDIUM: ssl: error when no certificate are found When a non-existing file was specified in the configuration, haproxy does not exit with an error, which is not normal. This bug was introduced by `dfa93be` ("MEDIUM: ssl: emulate multi-cert bundles loading in standard loading") which does nothing if the stat failed. This patch introduces a "found" variable which is checked at the end of the function so we exit with an error if no files were found. Must be backported to 2.3.	2020-11-20 18:38:56 +01:00
William Lallemand	86c2dd60f1	BUG/MEDIUM: ssl/crt-list: bundle support broken in crt-list In issue #970 it was reported that the bundle loading does not work anymore with crt-list. This bug was introduced by `47da821` ("MEDIUM: ssl: emulates the multi-cert bundles in the crtlist") which incorrectly uses "path" instead of "crt_path" in the name resolution. Must be backported to 2.3.	2020-11-20 18:38:51 +01:00
Christopher Faulet	aab1b67383	BUG/MEDIUM: http-ana: Don't eval http-after-response ruleset on empty messages It is not possible on response coming from a server, but an errorfile may be empty. In this case, the http-after-response ruleset must not be evaluated because it is totally unexpected to manipulate headers on an empty HTX message. This patch must be backported everywhere the http-after-response rules are supported, i.e. as far as 2.2.	2020-11-20 09:43:31 +01:00
Ilya Shipitsin	bdec3ba796	BUILD: ssl: use SSL_MODE_ASYNC macro instead of OPENSSL_VERSION	2020-11-19 19:59:32 +01:00
William Lallemand	f69cd68737	BUG/MINOR: ssl: segv on startup when AKID but no keyid In bug #959 it was reported that haproxy segfaults on startup when trying to load a certificate which uses the X509v3 AKID extension but without the keyid field. This field is not mandatory and could be replaced by the serial or the DirName. For example:
    X509v3 extensions:
        X509v3 Basic Constraints:
            CA:FALSE
        X509v3 Subject Key Identifier:
            42:7D:5F:6C:3E:0D:B7:2C:FD:6A:8A:32:C6:C6:B9:90:05:D1:B2:9B
        X509v3 Authority Key Identifier:
            DirName:/O=HAProxy Technologies/CN=HAProxy Test Intermediate CA
            serial:F2:AB:C1:41:9F:AB:45:8E:86:23:AD:C5:54:ED:DF:FA
This bug was introduced by 70df7b ("MINOR: ssl: add "issuers-chain-path" directive"). This patch must be backported as far as 2.2.	2020-11-19 16:24:13 +01:00
William Dauchy	f63704488e	MEDIUM: cli/ssl: configure ssl on server at runtime In the context of a progressive backend migration, we want to be able to activate SSL on outgoing connections to the server at runtime without reloading. This patch adds a `set server ssl` command; in order to allow that:
- add `srv_use_ssl` to `show servers state` command for compatibility, also update associated parsing
- when using default-server ssl setting, and `no-ssl` on server line, init SSL ctx without activating it
- when triggering ssl API, de/activate SSL connections as requested
- clean ongoing connections as it is done for addr/port changes, without checking prior server state
example config:
    backend be_foo
        default-server ssl
        server srv0 127.0.0.1:6011 weight 1 no-ssl
show servers state:
    5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - -1
where srv0 can switch to ssl later during the runtime:
    set server be_foo/srv0 ssl on
    5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - 1
Also update existing tests and create a new one. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-11-18 17:22:28 +01:00
William Dauchy	fc52f524b0	MINOR: ssl: create common ssl_ctx init a common init for ssl_ctx will be later usable in other functions in order to support hot enable of ssl during runtime. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-11-18 17:22:28 +01:00
Amaury Denoyelle	034c162b9b	MEDIUM: stats: add counters for failed handshake Report on ssl stats the total number of handshakes terminated in a failure.	2020-11-18 16:10:42 +01:00
Amaury Denoyelle	f70b7db825	MINOR: ssl: remove client hello counters Remove the ssl client hello received counter. This counter is not meaningful and was only implemented on the frontend.	2020-11-18 16:10:42 +01:00
Christopher Faulet	47d9a4e870	MINOR: flt-trace: Use a bitfield for the trace options Instead of using an integer for each option, we now use a bitfield. Each option is represented as a flag now.	2020-11-17 11:34:36 +01:00
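The move from one integer per option to one flag per option can be pictured with the sketch below. Everything in it is hypothetical: the flag names, `struct trace_config` and `trace_set_option()` are made up for the example, and the option keywords other than "quiet" are assumptions rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values: one bit per option. */
#define TRACE_F_QUIET    0x00000001u
#define TRACE_F_RAND_FWD 0x00000002u
#define TRACE_F_HEXDUMP  0x00000004u

/* A single word replaces several per-option integers. */
struct trace_config {
    unsigned int flags;
};

/* Sets the flag matching keyword <opt>. Returns 1 on success, 0 if the
 * keyword is unknown.
 */
int trace_set_option(struct trace_config *conf, const char *opt)
{
    assert(conf != NULL && opt != NULL);

    if (strcmp(opt, "quiet") == 0)
        conf->flags |= TRACE_F_QUIET;
    else if (strcmp(opt, "random-forwarding") == 0)
        conf->flags |= TRACE_F_RAND_FWD;
    else if (strcmp(opt, "hexdump") == 0)
        conf->flags |= TRACE_F_HEXDUMP;
    else
        return 0;
    return 1;
}
```

Testing an option then becomes a mask check such as `conf->flags & TRACE_F_QUIET`, and adding an option only requires a new bit rather than a new structure member.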
Christopher Faulet	96a577acae	MINOR: flt-trace: Add an option to inhibit trace messages The 'quiet' option may be set to inhibit the trace messages. The trace filter is a bit verbose. This option may be used to not display the messages.	2020-11-17 11:34:36 +01:00
Christopher Faulet	c41d8bd65a	CLEANUP: flt-trace: Remove unused random-parsing option This option was only used by the legacy HTTP mode. In HTX, it is not used. So it can be removed.	2020-11-17 11:34:30 +01:00
Christopher Faulet	63c69a9b4e	BUG/MINOR: http-ana: Don't wait for the body of CONNECT requests CONNECT requests are bodyless messages but with no EOM blocks. Thus, conditions to stop waiting for the message payload are not suited to this kind of messages. Indeed, the message finishes on an EOH block. But the tunnel mode at the stream level is only set in HTTP_XFER_BODY analyser. So, the stream is blocked, waiting for a body that does not exist till a timeout expires. To fix this bug, we just stop waiting for a body for CONNECT requests. Another solution is to rely on HTX_SL_F_BODYLESS/HTTP_MSGF_BODYLESS flags. But this one is less intrusive. This message must be backported as far as 2.0. For the 2.0, only the HTX part must be fixed.	2020-11-17 10:03:12 +01:00
Christopher Faulet	22fca1f2c8	BUG/MEDIUM: filters: Forward all filtered data at the end of http filtering When http filtering ends, if there are some filtered data not forwarded yet, we forward them, in flt_http_end(). Most of time, this doesn't happen, except when a tunnel is established using a CONNECT. In this case, there is not EOM on the request and there is no body. Thus the headers are never forwarded, blocking the stream. This patch must be backported as far as 2.0. Prior versions don't suffer of this bug because there is no HTX support. On the 2.0, the change is only applicable on HTX streams. A special test must be performed to make sure.	2020-11-17 09:59:35 +01:00
Eric Salama	9139ec34ed	MINOR: cfgparse: tighten the scope of newnameserver variable, free it on error. This should fix issue GH #931. Also remove a misleading comment. This commit can be backported as far as 1.9	2020-11-13 16:26:10 +01:00
Christopher Faulet	fc633b6eff	CLEANUP: config: Return ERR_NONE from config callbacks instead of 0 Return ERR_NONE instead of 0 on success for all config callbacks that should return ERR_* codes. There is no change because ERR_NONE is a macro equals to 0. But this makes the return value more explicit.	2020-11-13 16:26:10 +01:00
Christopher Faulet	5214099233	MINOR: config/mux-h2: Return ERR_ flags from init_h2() instead of a status post-check function callbacks must return ERR_* flags. Thus, init_h2() is fixed to return ERR_NONE on success or (ERR_ALERT\|ERR_FATAL) on error. This patch may be backported as far as 2.2.	2020-11-13 16:26:10 +01:00
Christopher Faulet	83fefbcdff	MINOR: init: Fix the prototype for per-thread free callbacks Functions registered to release memory per-thread have no return value. But the registering function and the function pointer in per_thread_free_fct structure specify it should return an integer. This patch fixes it. This patch may be backported as far as 2.0.	2020-11-13 16:26:10 +01:00
Christopher Faulet	c751b4508d	BUG/MINOR: tcpcheck: Don't warn on unused rules if check option is after When tcp-check or http-check rules are used, if the corresponding check option (option tcp-check and option httpchk) is declared after the ruleset, a warning is emitted about an unused check ruleset while there is no problem in reality. This patch must be backported as far as 2.2.	2020-11-13 16:26:10 +01:00
Christopher Faulet	c7ba91039a	MINOR: spoe: Don't close connection in sync mode on processing timeout In sync mode, if an applet receives an ack while the processing delay has already expired, there is no frame waiting for this ack. But there is no reason to close the connection in this case. The ack may be ignored and the connection may be reused to process another frame. The only reason to trigger an error and close the connection is when the wrong ack is received while there is still a frame waiting for its ack. In sync mode, this should never happen. This patch may be backported in all versions supporting the SPOE.	2020-11-13 16:26:10 +01:00
Christopher Faulet	cf181c76e3	BUG/MAJOR: spoe: Be sure to remove all references on a released spoe applet When a SPOE applet is used to send a frame, a reference on this applet is saved in the spoe context of the offloaded stream. But, if the applet is released before receiving the corresponding ack, we must be sure to remove this reference. This was performed for fragmented frames only. But it must also be performed for spoe contexts in the applet waiting_queue and in the thread waiting_queue (used in async mode). This bug leads to a memory corruption when an offloaded stream tries to update the state of a released applet because it still has a reference on it. There are many ways to trigger this bug. The easiest is probably during reloads. On the old process, all applets are woken up to be released ASAP. Many thanks to Maciej Zdeb for reporting the bug and working on it for 2 months. Without his help, it would have been much more difficult to fix the bug. It is always a huge pleasure to see how some users are enthusiastic and helpful. Thanks again Maciej ! This patch must be backported to all versions where the spoe is supported (>= 1.7).	2020-11-13 16:26:10 +01:00
Christopher Faulet	3005d28eb8	BUG/MINOR: http-htx: Handle warnings when parsing http-error and http-errors First of all, this patch is tagged as a bug. But in fact, it only fixes a bug in the 2.2. On the 2.3 and above, it only adds the ability to display warnings, when an http-error directive is parsed from a proxy section and when an errorfile directive is parsed from a http-errors section. But on the 2.2, it makes sure to display the warning emitted on a content-length mismatch when an errorfile is parsed. The following is only applicable to the 2.2. commit "BUG/MINOR: http-htx: Just warn if payload of an errorfile doesn't match the C-L" (which is only present in 2.2, 2.1 and 2.0 trees, i.e see commit 7bf3d81d3cf4b9f4587 in 2.2 tree), is changing the behavior of `http_str_to_htx` function. It may now emit warnings. And, it is the caller's responsibility to display it. But the warning is missing when an 'http-error' directive is parsed from a proxy section. It is also missing when an 'errorfile' directive is parsed from a http-errors section. This bug only exists on the 2.2. On earlier versions, these directives are not supported and on later ones, an error is triggered instead of a warning. Thanks to William Dauchy that spotted the bug. This patch must be backported as far as 2.2.	2020-11-13 16:26:10 +01:00
Amaury Denoyelle	90eb93f792	MINOR: check: report error on incompatible connect proto Report an error when using an explicit proto for a connect rule with non-compatible mode in regards with the selected check type (tcp-check vs http-check).	2020-11-13 16:26:10 +01:00
Amaury Denoyelle	7c14890183	MINOR: check: report error on incompatible proto If the check mux has been explicitly defined but is incompatible with the selected check type (tcp-check vs http-check), report a warning and prevent haproxy startup.	2020-11-13 16:26:10 +01:00
Amaury Denoyelle	0519bd4d04	BUG/MEDIUM: check: reuse srv proto only if using same mode Only reuse the mux from server if the check is using the same mode. For example, this prevents a tcp-check on a h2 server to select the h2 multiplexer instead of passthrough. This bug was introduced by the following commit : BUG/MEDIUM: checks: Use the mux protocol specified on the server line It must be backported up to 2.2. Fixes github issue #945.	2020-11-13 16:26:10 +01:00
Christopher Faulet	97fc8da264	BUG/MINOR: http-fetch: Fix calls w/o parentheses of the cookie sample fetches req.cook, req.cook_val, req.cook_cnt and their response counterparts may be called without cookie name. In this case, empty parentheses may be used, or no parentheses at all. In both cases, the result must be the same. But only the first one works. The second one always returns a failure. This patch fixes this bug. Note that on old versions (< 2.2), both cases fail. This patch must be backported in all stable versions.	2020-11-13 16:26:10 +01:00
Maciej Zdeb	dea7c209f8	BUG/MINOR: http-fetch: Extract cookie value even when no cookie name HTTP sample fetches dealing with the cookies (req/res.cook, req/res.cook_val and req/res.cook_cnt) must be prepared to be called without cookie name. For the first two, the first cookie value is returned, regardless its name. For the last one, all cookies are counted. To do so, http_extract_cookie_value() may now be called with no cookie name (cookie_name_l set to 0). In this case, the matching on the cookie name is ignored and the first value found is returned. Note this patch also fixes matching on cookie values in ACLs. This should be backported in all stable versions.	2020-11-13 16:26:10 +01:00
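The behaviour described above (an empty cookie name matches the first cookie found) can be sketched with a small parser. This is not http_extract_cookie_value(): the function name, signature and the simplified `name=value; name=value` grammar are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks up a cookie in a "name=value; name=value" header value.
 * When <name_len> is 0, matching on the name is skipped and the first
 * cookie's value is returned. Returns 1 and fills <val>/<val_len> on
 * success, 0 if no cookie matches.
 */
int extract_cookie_value(const char *hdr, const char *name, size_t name_len,
                         const char **val, size_t *val_len)
{
    const char *p = hdr;

    assert(hdr != NULL && val != NULL && val_len != NULL);

    while (*p) {
        const char *key, *v;
        size_t key_len, vl = 0;

        /* Skip delimiters between cookies. */
        while (*p == ' ' || *p == ';')
            p++;
        if (!*p)
            break;

        key = p;
        while (*p && *p != '=' && *p != ';')
            p++;
        key_len = (size_t)(p - key);

        v = p;
        if (*p == '=') {
            v = ++p;
            while (*p && *p != ';')
                p++;
            vl = (size_t)(p - v);
        }

        /* An empty name matches any cookie. */
        if (name_len == 0 ||
            (key_len == name_len && memcmp(key, name, name_len) == 0)) {
            *val = v;
            *val_len = vl;
            return 1;
        }
    }
    return 0;
}
```

Counting all cookies, as described for the `cook_cnt` fetches, would follow the same loop with a counter in place of the early return.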
Willy Tarreau	1dfd4f106f	BUG/MEDIUM: peers: fix decoding of multi-byte length in stick-table messages There is a bug in peer_recv_msg() due to an incorrect cast when trying to decode the varint length of a stick-table message, causing lengths comprised between 128 and 255 to consume one extra byte, ending in protocol errors. The root cause of this is that peer_recv_msg() tries hard to reimplement all the parsing and control that is already done in intdecode() just to measure the length before calling it. And it got it wrong. Let's just get rid of this unneeded code duplication and solely rely on intdecode() instead. The bug was introduced in 2.0 as part of a cleanup pass on this code with commit `95203f218` ("MINOR: peers: Move high level receive code to reduce the size of I/O handler."), so this patch must be backported to 2.0. Thanks to Yves Lafon for reporting the problem.	2020-11-13 15:21:50 +01:00
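To make the varint discussion above more concrete, here is a sketch of a variable-length integer codec in which values below 240 fit in one byte and larger values spill into continuation bytes. The layout is written from memory of how the peers protocol encodes integers and must be treated as an assumption; `varint_encode()` and `varint_decode()` are hypothetical names, not intencode()/intdecode().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VARINT_MAX_BYTES 10 /* enough for any 64-bit value in this layout */

/* Encodes <v> into <out> (at least VARINT_MAX_BYTES long).
 * Returns the number of bytes written.
 */
size_t varint_encode(uint64_t v, unsigned char *out)
{
    size_t n = 0;

    assert(out != NULL);

    if (v < 240) {
        out[n++] = (unsigned char)v;
        return n;
    }
    out[n++] = (unsigned char)(v | 240); /* low 4 bits, high nibble set */
    v = (v - 240) >> 4;
    while (v >= 128) {
        out[n++] = (unsigned char)(v | 128); /* 7 bits + continuation bit */
        v = (v - 128) >> 7;
    }
    out[n++] = (unsigned char)v;
    return n;
}

/* Decodes one integer from <buf>. Returns the number of bytes consumed,
 * 0 if <len> bytes are not enough yet, or -1 on a malformed sequence.
 * Bytes are read as unsigned values so that 128..255 stay positive.
 */
int varint_decode(const unsigned char *buf, size_t len, uint64_t *value)
{
    size_t pos = 0;
    unsigned int shift = 4;
    unsigned char b;
    uint64_t v;

    assert(buf != NULL && value != NULL);

    if (len == 0)
        return 0;

    v = buf[pos++];
    if (v >= 240) {
        do {
            if (pos >= len)
                return 0; /* incomplete: wait for more data */
            if (shift > 63)
                return -1; /* too many continuation bytes */
            b = buf[pos++];
            v += (uint64_t)b << shift;
            shift += 7;
        } while (b >= 128);
    }
    *value = v;
    return (int)pos;
}
```

Having a single decoder report both the value and the number of bytes consumed is what removes the need for a separate, hand-written length pre-scan of the kind the commit deletes.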
Frédéric Lécaille	ea875e62e6	BUG/MINOR: peers: Missing TX cache entries reset. The TX part of a cache for a dictionary is made of a reserved array of ebtree nodes which are pointers to dictionary entries. So when we flush the TX part of such a cache, we must not only remove these nodes to dictionary entries from their ebtree. We must also reset their values. Furthermore, the LRU key and the last lookup result must also be reset.	2020-11-13 06:04:18 +01:00
Frédéric Lécaille	f9e51beec1	BUG/MINOR: peers: Do not ignore a protocol error for dictionary entries. If we could not decode the ID of a dictionary entry from a peer update message, we must inform the remote peer about such an error as this is done for any other decoding error.	2020-11-13 06:04:08 +01:00
Frédéric Lécaille	d865935f32	MINOR: peers: Add traces to peer_treat_updatemsg(). Add minimalistic traces for peers with only one event to diagnose potential issues when decoding peer update messages.	2020-11-12 17:38:49 +01:00
Amaury Denoyelle	7f8f6cb926	BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one Define per-thread counters allocated with the greatest size of any stat module counters. This variable is named trash_counters. When using a proxy without allocated counters, return the trash counters from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent segfault. This is useful for all the proxies used internally and not belonging to the global proxy list. As these objects do not appear in the stats report, using the dummy counters for them is harmless. For this fix to be functional, the extra counters are explicitly initialized to NULL on proxy/server/listener init functions. Most notably, the crash has already been detected with the following vtc: - reg-tests/lua/txn_get_priv.vtc - reg-tests/peers/tls_basic_sync.vtc - reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc There are probably other parts that may be impacted (SPOE for example). This bug was introduced in the current release and does not need to be backported. The faulty commits are "MINOR: ssl: count client hello for stats" and "MINOR: ssl: add counters for ssl sessions".	2020-11-12 15:16:05 +01:00
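The fallback described above can be sketched as an accessor that never returns an invalid pointer. The structures and `counters_get()` are hypothetical simplifications and do not reproduce the EXTRA_COUNTERS_GET macro; the sketch also assumes a C11 compiler for `_Thread_local`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters block for one stats module. */
struct ssl_counters {
    unsigned long sess;
    unsigned long reused_sess;
};

/* Hypothetical proxy: internal proxies keep extra_counters set to NULL. */
struct proxy {
    struct ssl_counters *extra_counters;
};

/* Per-thread scratch counters: writes land here and are never reported. */
static _Thread_local struct ssl_counters trash_counters;

/* Always returns a writable counters block, real or scratch. */
struct ssl_counters *counters_get(struct proxy *px)
{
    assert(px != NULL);
    return px->extra_counters ? px->extra_counters : &trash_counters;
}
```

This only works if the pointer is reliably NULL when nothing was allocated, which is why the commit also initializes the extra counters to NULL in the init functions.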
Amaury Denoyelle	a2a6899bee	BUG/MINOR: stats: free dynamically stats fields/lines on shutdown Register a new function on POST DEINIT to free stats fields/lines for each domain. This patch does not fix a critical bug but may be backported to 2.3.	2020-11-12 15:16:05 +01:00
Remi Tricot-Le Breton	cc9bf2e5fe	MEDIUM: cache: Change caching conditions Do not cache responses that do not have an explicit expiration time (s-maxage or max-age Cache-Control directives or Expires header) or a validator (ETag or Last-Modified headers) anymore, as suggested in RFC 7234#3. The TX_FLAG_IGNORE flag is used instead of the TX_FLAG_CACHEABLE so as not to change the behavior of the checkcache option.	2020-11-12 11:22:05 +01:00
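The new caching condition boils down to a boolean test on the response headers. The sketch below assumes the headers have already been parsed into flags; `struct response_info` and `response_is_storable()` are hypothetical names, and other conditions the cache applies are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what was found in a response. */
struct response_info {
    bool has_s_maxage;      /* Cache-Control: s-maxage=... */
    bool has_max_age;       /* Cache-Control: max-age=... */
    bool has_expires;       /* Expires header */
    bool has_etag;          /* ETag header */
    bool has_last_modified; /* Last-Modified header */
};

/* A response qualifies only with an explicit expiration time or a
 * validator, following the rule stated in the commit message.
 */
bool response_is_storable(const struct response_info *r)
{
    bool explicit_expiry, validator;

    assert(r != NULL);

    explicit_expiry = r->has_s_maxage || r->has_max_age || r->has_expires;
    validator = r->has_etag || r->has_last_modified;
    return explicit_expiry || validator;
}
```

A response carrying none of the five headers is therefore skipped, while one with only an ETag still qualifies.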
Thierry Fournier	91dc0c0d8f	BUG/MINOR: lua: set buffer size during map lookups This size is used by some pattern matching to determine if there is sufficient room in the buffer to add final \0 if necessary. If the size is not set, the conditions use uninitialized value. Note: it seems this bug can't cause a crash. Should be backported until 2.2 (at least)	2020-11-11 10:43:21 +01:00
Thierry Fournier	a68affeaa9	BUG/MINOR: pattern: a sample marked as const could be written The functions add final 0 to string if the final 0 is not set, but don't check the flag CONST. This patch duplicates the strings if the final zero is not set and the string is CONST. Should be backported until 2.2 (at least)	2020-11-11 10:43:15 +01:00
William Lallemand	50c03aac04	BUG/MEDIUM: ssl/crt-list: correctly insert crt-list line if crt already loaded In issue #940, it was reported that the crt-list does not work correctly anymore. Indeed when inserting a crt-list line which use a certificate previously seen in the crt-list, this one won't be inserted in the SNI list and will be silently ignored. This bug was introduced by commit `47da821` "MEDIUM: ssl: emulates the multi-cert bundles in the crtlist". This patch also includes a reg-test which tests this issue. This bugfix must be backported in 2.3.	2020-11-06 16:39:39 +01:00
Willy Tarreau	431a12cafe	BUILD: http-htx: fix build warning regarding long type in printf Commit `a66adf41e` ("MINOR: http-htx: Add understandable errors for the errorfiles parsing") added a warning when loading malformed error files, but this warning may trigger another build warning due to the %lu format used. Let's simply cast it for output since it's just used for end user output. This must be backported to 2.0 like the commit above.	2020-11-06 14:24:02 +01:00
Willy Tarreau	4299528390	BUILD: ssl: silence build warning on uninitialised counters Since commit `d0447a7c3` ("MINOR: ssl: add counters for ssl sessions"), gcc 9+ complains about this:
    CC      src/ssl_sock.o
    src/ssl_sock.c: In function 'ssl_sock_io_cb':
    src/ssl_sock.c:5416:3: warning: 'counters_px' may be used uninitialized in this function [-Wmaybe-uninitialized]
     5416 |   ++counters_px->reused_sess;
          |   ^~~~~~~~~~~~~~~~~~~~~~~~~~
    src/ssl_sock.c:5133:23: note: 'counters_px' was declared here
     5133 |  struct ssl_counters *counters, *counters_px;
          |                                  ^~~~~~~~~~~
Either a listener or a server is expected there, so the counters are always initialized, but the compiler cannot know this. Let's preset them and test before updating the counter, we're not in a hot path here. No backport is needed.	2020-11-06 13:22:44 +01:00
Willy Tarreau	f5fe70620c	MINOR: server: remove idle lock in srv_cleanup_connections This function used to grab the idle lock when scanning the threads for idle connections, but it doesn't need it since the lock only protects the tree. Let's remove it.	2020-11-06 13:22:44 +01:00
Amaury Denoyelle	d0447a7c3e	MINOR: ssl: add counters for ssl sessions Add counters for newly established and resumed sessions.	2020-11-06 12:05:17 +01:00
Amaury Denoyelle	fbc3377cd4	MINOR: ssl: count client hello for stats Add a counter for ssl client_hello received on frontends.	2020-11-06 12:05:17 +01:00
Amaury Denoyelle	9963fa74d2	MINOR: ssl: instantiate stats module This module is responsible for providing statistics for ssl. It allocates counters for frontend/backend/listener/server objects.	2020-11-06 12:05:17 +01:00
Christopher Faulet	a66adf41ea	MINOR: http-htx: Add understandable errors for the errorfiles parsing No details are provided when an error occurs during the parsing of an errorfile. Thus it is a bit hard to diagnose where the problem is. Now, when it happens, an understandable error message is reported. This patch is not a bug fix in itself. But it will be required to change a fatal error into a warning in last stable releases. Thus it must be backported as far as 2.0.	2020-11-06 09:13:58 +01:00
Willy Tarreau	6d27a92b83	BUG/MINOR: ssl: don't report 1024 bits DH param load error when it's higher The default dh_param value is 2048 and it's preset to zero unless explicitly set, so we must not report a warning about DH param not being loadable in 1024 bits when we're going to use 2048. Thanks to Dinko for reporting this. This should be backported to 2.2.	2020-11-05 19:40:14 +01:00
Jerome Magnin	eff2e0a958	CLEANUP: cfgparse: remove duplicate registration for transparent build options Since commit `37bafdcbb` ("MINOR: sock_inet: move the IPv4/v6 transparent mode code to sock_inet"), build options for transparent proxying are registered twice. This patch removes the older one.	2020-11-05 19:27:16 +01:00
Willy Tarreau	38d41996c1	MEDIUM: pattern: turn the pattern chaining to single-linked list It does not require heavy deletion from the expr anymore, so we can now turn this to a single-linked list since most of the time we want to delete all instances of a given pattern from the head. By doing so we save 32 bytes of memory per pattern. The pat_unlink_from_head() function was adjusted accordingly.	2020-11-05 19:27:09 +01:00
Willy Tarreau	867a8a5a10	MINOR: pattern: prepare removal of a pattern from the list head Instead of using LIST_DEL() on the pattern itself inside an expression, we look it up from its head. The goal is to get rid of the double-linked list while this usage remains exclusively for freeing on startup error!	2020-11-05 19:27:09 +01:00
Willy Tarreau	2817472bb0	MINOR: pattern: during reload, delete elements from the ref, not the expression Instead of scanning all elements from the expression and using the slow delete path there, let's use the faster way which involves pat_delete_gen() while the elements are detached from their reference.	2020-11-05 19:27:09 +01:00
Willy Tarreau	ae83e63b48	MEDIUM: pattern: make pat_ref_prune() rely on pat_ref_purge_older() When purging all of a reference, it's much more efficient to scan the reference patterns from the reference head and delete all derivative patterns than to scan the expressions. The only thing is that we need to proceed both for the current and next generations, in case there is a huge gap between the two. With this, purging 20M IP addresses in small batches of 100 takes roughly 3 seconds.	2020-11-05 19:27:09 +01:00
Willy Tarreau	94b9abe200	MINOR: pattern: add pat_ref_purge_older() to purge old entries This function will be usable to purge at most a specified number of old entries from a reference. Entries are declared old if their generation number is in the past compared to the one passed in argument. This will ease removal of early entries when new ones have been appended. We also call malloc_trim() when available, at the end of the series, because this is one place where there is a lot of memory to save. Reloads of 1M IP addresses used in an ACL made the process grow up to 1.7 GB RSS after 10 reloads and roughly stabilize there without this call, versus only 260 MB when the call is present. Sadly there is no direct equivalent for jemalloc, which stabilizes around 800MB-1GB.	2020-11-05 19:27:09 +01:00
Willy Tarreau	1a6857b9c1	MINOR: pattern: implement pat_ref_load() to load a pattern at a given generation pat_ref_load() basically combines pat_ref_append() and pat_ref_commit(). It's very similar to pat_ref_add() except that it also allows to set the generation ID and the line number. pat_ref_add() was modified to directly rely on it to avoid code duplication. Note that a previous declaration of pat_ref_load() was removed as it was just a leftover of an earlier incarnation of something possibly similar, so no existing functionality was changed here.	2020-11-05 19:27:09 +01:00
Willy Tarreau	0439e5eeb4	MINOR: pattern: add pat_ref_commit() to commit a previously inserted element This function will be used after a successful pat_ref_append() to propagate the pattern to all use places (including parsing and indexing). On failure, it will entirely roll back all insertions and free the pattern itself. It also preserves the generation number so that it is convenient for use in association with pat_ref_append(). pat_ref_add() was modified to rely on it instead of open-coding the insertion and roll-back.	2020-11-05 19:27:09 +01:00
Willy Tarreau	c93da6950e	MEDIUM: pattern: only match patterns that match the current generation Instead of matching any pattern found in the tree, only match those matching the current generation of entries. This will make sure that reloads are atomic, regardless of the time they take to complete, and that newly added data are not matched until the whole reference is committed. For consistency we proceed the same way on "show map" and "show acl". This will have no impact for now since generations are not used.	2020-11-05 19:27:09 +01:00
Willy Tarreau	29947745b5	MINOR: pattern: store a generation number in the reference patterns Right now it's not possible to perform a safe reload because we don't know what patterns were recently added or were already present. This patch adds a generation counter to the reference patterns so that it is possible to know what generation of the reference they were loaded with. A reference now has two generations, the current one, used for all additions, and the next one, allocated to those wishing to update the contents. The generation wraps at 2^32 so comparisons must be made relative to the current position. The idea will be that upon full reload, the caller will first get a new generation ID, will insert all new patterns using it, will then switch the current ID to the new one, and will delete all entries older than the current ID. This has the benefit of supporting chunked updates that remain consistent and that won't block the whole process for ages like pat_ref_reload() currently does.	2020-11-05 19:27:09 +01:00
Willy Tarreau	1fd52f70e5	MINOR: pattern: introduce pat_ref_delete_by_ptr() to delete a valid reference Till now the only way to remove a known reference was via pat_ref_delete_by_id() which scans the whole list to find a matching pointer. Let's add pat_ref_delete_by_ptr() which takes a valid pointer. It can be called by the function above after the pointer is found, and can also be used to roll back a failed insertion much more efficiently.	2020-11-05 19:27:09 +01:00
Willy Tarreau	a98b2882ac	CLEANUP: pattern: remove pat_delete_fcts[] and pattern_head->delete() These ones are not used anymore, so let's remove them to remove a bit of the complexity. The ACL keyword's delete() function could be removed as well, though most keyword declarations are positional and we have a high risk of introducing a mistake here, so let's not touch the ACL part.	2020-11-05 19:27:09 +01:00
Willy Tarreau	b35aa9b256	CLEANUP: acl: don't reference the generic pattern deletion function anymore A few ACL keyword used to reference pat_delete_gen() as the deletion function but this is not needed since it's the default one now. Let's just remove this reference.	2020-11-05 19:27:09 +01:00
Willy Tarreau	e828d8f0e8	MINOR: pattern: perform a single call to pat_delete_gen() under the expression When we're removing an element under the expression lock, we don't need anymore to run over all ->delete() functions via the expressions, since we know that the single function does it fine now. Note that at this point, pattern->delete() is not used at all throughout the code anymore.	2020-11-05 19:27:09 +01:00
Willy Tarreau	f1c0892aa6	MINOR: pattern: remerge the list and tree deletion functions pat_del_tree_gen() was already chained onto pat_del_list_gen() to deal with remaining cases, so let's complete the merge and have a generic pattern deletion function acting on the reference and taking care of reliably removing all elements.	2020-11-05 19:27:09 +01:00
Willy Tarreau	78777ead32	MEDIUM: pattern: change the pat_del_* functions to delete from the references This is the next step in speeding up entry removal. Now we don't scan the whole lists or trees for elements pointing to the target reference, instead we start from the reference and delete all linked patterns. This simplifies some delete functions since we don't need anymore to delete multiple times from an expression since all nodes appear after the reference element. We can now have one generic list and one generic tree deletion function. This required the replacement of pattern_delete() with an open-coded version since we now need to lock all expressions first before proceeding. This means there is a high risk of lock inversion here but given that the expressions are always scanned in the same order from the same head, this must not happen. Now deleting first entries is instantaneous, and it's still slow to delete the last ones when looking up their ID since it still requires to look them up by a full scan, but it's already way faster than previously. Typically removing the last 10 IP from a 20M entries ACL with a full-scan each took less than 2 seconds. It would be technically possible to make use of indexed entries to speed up most lookups for removal by value (e.g. IP addresses) but that's for later.	2020-11-05 19:27:09 +01:00
Willy Tarreau	4bdd0a13d6	MEDIUM: pattern: link all final elements from the reference There is a data model issue in the current pattern design that makes pattern deletion extremely expensive: there's no direct way from a reference to access all indexed occurrences. As such, the only way to remove all indexed entries corresponding to a reference update is to scan all expressions' lists and trees to find a link to the reference. While this was possibly OK when map removal was not common and most maps were small, this is not conceivable anymore with GeoIP maps containing 10M+ entries and del-map operations that are triggered from http-request rulesets. This patch introduces two list heads from the pattern reference, one for the objects linked by lists and one for those linked by tree node. Ideally a single list would be enough but the linked elements are too unrelated to be distinguished at the moment, so we'll need two lists. However for the long term a single-linked list will suffice but for now it's not possible due to the way elements are removed from expressions. As such this patch adds 32 bytes of memory usage per reference plus 16 per indexed entry, but both will be cut in half later. The links are not yet used for deletion, this patch only ensures the list is always consistent.	2020-11-05 19:27:09 +01:00
Willy Tarreau	6d8a68914e	MINOR: pattern: make the delete and prune functions more generic Now we have a single prune() function to act on an expression, and one delete function for the lists and one for the trees. The presence of a pointer in the lists is enough to warrant a free, and we rely on the PAT_SF_REGFREE flag to decide whether to free using free() or regfree().	2020-11-05 19:27:09 +01:00
Willy Tarreau	9b5c8bbc89	MINOR: pattern: new sflag PAT_SF_REGFREE indicates regex_free() is needed Currently we have no way to know how to delete/prune a pattern in a generic way. A pattern doesn't contain its own type so we don't know what function to call. Tree nodes are roughly OK but not lists where regex are possible. Let's add one new bit for sflags at index time to indicate that regex_free() will be needed upon deletion. It's not used for now.	2020-11-05 19:27:08 +01:00
Willy Tarreau	d4164dcd4a	CLEANUP: pattern: delete the back refs at once during pat_ref_reload() It's pointless to delete a backref and relink it to the next entry since the next entry is going to do the exact same and so on until all of them are deleted. Let's simply delete backrefs on reload.	2020-11-05 19:27:08 +01:00
Willy Tarreau	3ee0de1b41	MINOR: pattern: move the update revision to the pat_ref, not the expression It's not possible to uniquely update a single expression without updating the pattern reference, I don't know why we've put the revision in the expression back then, given that it in fact provides an update for a full pattern. Let's move the revision into the reference's head instead.	2020-11-05 19:27:08 +01:00
Willy Tarreau	114d698fde	MEDIUM: pattern: call malloc_trim() on pat_ref_reload() This is one case where we may release large amounts of data at once. Tests show that without this, after 10 full reloads of an ACL containing 1M IP addresses, the memory usage grew and stabilized around 1.7 GB of RSS. With this change, it stays around 260 MB and is stable across reloads.	2020-11-05 19:27:08 +01:00
Willy Tarreau	88366c2926	MEDIUM: pools: call malloc_trim() from pool_gc() If available it definitely makes sense to call it since it's also called when stopping to reclaim the maximum possible memory.	2020-11-05 19:27:08 +01:00
Baptiste Assmann	e279ca6bbe	MINOR: sample: Add converters to parse MQTT messages This patch implements a couple of converters to validate and extract data from a MQTT (Message Queuing Telemetry Transport) message. The validation consists of a few checks as well as "packet size" validation. The extraction can get any field from the variable header and the payload. This is limited to CONNECT and CONNACK packet types only. All other messages are considered as invalid. It is not a problem for now because only the first packet on each side can be parsed (CONNECT for the client and CONNACK for the server). MQTT 3.1.1 and 5.0 are supported. Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>	2020-11-05 19:27:03 +01:00
Baptiste Assmann	e138dda1e0	MINOR: sample: Add converters to parse FIX messages This patch implements a couple of converters to validate and extract tag value from a FIX (Financial Information eXchange) message. The validation consists in a few checks such as mandatory fields and checksum computation. The extraction can get any tag value based on a tag string or tag id. This patch requires the istend() function. Thus it depends on "MINOR: ist: Add istend() function to return a pointer to the end of the string". Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>	2020-11-05 19:26:30 +01:00
Ilya Shipitsin	0aa8c29460	BUILD: ssl: use feature macros for detecting ec curves manipulation support Let us use SSL_CTX_set1_curves_list, defined by OpenSSL, as well as in openssl-compat when SSL_CTRL_SET_CURVES_LIST is present (BoringSSL), for feature detection instead of versions.	2020-11-05 15:08:41 +01:00
William Lallemand	99e0bb997f	MINOR: mworker/cli: the master CLI uses its own applet Following the patch b4daee ("MINOR: sock: add a check against cross worker<->master socket activities"), this patch adds a dedicated applet for the master CLI. It ensures that the CLI connection can't be used with the master rights in the case of bugs.	2020-11-05 10:28:53 +01:00
Willy Tarreau	21b9ff59b2	BUG/MEDIUM: server: make it possible to kill last idle connections In issue #933, @jaroslawr provided a report indicating that when using many threads and many servers, it's very difficult to terminate the last idle connections on each server. The issue has two causes in fact. The first one is that during the calculation of the estimate of needed connections, we round the computation up while in previous round it was already rounded up, so we end up adding 1 to 1 which once divided by 2 remains 1. The second issue is that servers are not woken up anymore for purging their connections if they don't have activity. The only reason that was there to wake them up again was in case insufficient connections were purged. And even then the purge task itself was not woken up. But that is not enough for getting rid of the long tail of old connections nor updating est_need_conns. This patch makes sure to properly wake up as long as at least one idle connection remains, and not to round up the needed connections anymore. Prior to this patch, a test involving many connections which suddenly stopped would keep many idle connections, now they're effectively halved every pool-purge-delay. This needs to be backported to 2.2.	2020-11-05 09:12:20 +01:00
Willy Tarreau	b4daeeb094	MINOR: sock: add a check against cross worker<->master socket activities Given that the previous issues caused spurious worker socket wakeups in the master for inherited FDs that couldn't be closed, let's add a strict test in the I/O callback to make sure that an accept() event is always caught by the appropriate type of process (master for master listeners, worker for worker listeners).	2020-11-04 15:05:50 +01:00
Christopher Faulet	fafd1b0a5b	CLEANUP: mux-h2: Remove the h1 parser state from the h2 stream Since the h2 multiplexer no longer relies on the legacy HTTP representation, and uses exclusively the HTX, the H1 parser state (h1m) is no longer used by the h2 streams. Thus it can be removed. This patch may be backported as far as 2.1.	2020-11-04 15:02:24 +01:00
Willy Tarreau	a4380b211f	MEDIUM: listeners: make use of fd_want_recv_safe() to enable early receivers We used to refrain from calling fd_want_recv() if fd_updt was not allocated but it's not the right solution as this does not allow the FD to be set. Instead, let's use the new fd_want_recv_safe() which will update the FD and create an update entry only if possible. In addition, the equivalent test before calling fd_stop_recv() was removed as totally useless since there's no fd_updt creation in this case.	2020-11-04 14:22:42 +01:00
Willy Tarreau	22ccd5ebaf	BUG/MEDIUM: listener: make the master also keep workers' inherited FDs In commit `374e9af35` ("MEDIUM: listener: let do_unbind_listener() decide whether to close or not") it didn't appear necessary to have the master process keep open the workers' inherited FDs. But this is actually necessary to handle the reload on "bind fd@foo" situations, otherwise the FD may be reassigned and the new socket cannot be set up, sometimes causing "socket operation on non-socket" or other types of errors. William found that this was the cause for the consistent failures of the abns regtest, which already used to fail very often before this and was as such marked as broken. Interestingly I didn't have this issue with my test configs because the FD number I used was higher and within the range of other listening sockets. But this means that one of these wouldn't work as expected. No backport is needed, this was introduced as part of the listeners rework in 2.3.	2020-11-04 14:22:42 +01:00
Willy Tarreau	59b5da4873	BUG/MEDIUM: listener: never suspend inherited sockets It is not acceptable to suspend an inherited socket because we'd kill its listening state, making it possibly unrecoverable for future processes. The situation which can trigger this is when there is an abns socket in a config and an inherited FD on another listener. Upon soft reload, the abns fails to bind, a SIGTTOU is sent to the old process which suspends everything, including the inherited FD, then the new process can bind and tell the old one to quit. Except that the new FD was not set back to the listen state, which is detected by listener_accept() which can pause it. It's only upon second reload that the FD works again. The solution is to refrain from suspending such FDs since we don't own them. And the next process will get them right anyway from its config. For now only TCP and UDP face this issue so it's better to address this on a protocol basis. No backport is needed, this is related to the new listeners in 2.3.	2020-11-04 14:22:42 +01:00
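
Commit 29947745b5 above notes that the pattern generation counter wraps at 2^32, so comparisons must be made relative to the current position rather than with a plain `<`. A minimal sketch of such a wrap-safe comparison follows. The helper name `gen_is_older()` is hypothetical and is not taken from the HAProxy sources; it only illustrates the arithmetic, assuming generations are never more than 2^31 apart.

```c
#include <stdint.h>

/* Hypothetical helper (not an HAProxy function): returns non-zero when
 * generation <gen> is strictly older than <cur> on a 2^32 ring.
 * The unsigned difference wraps modulo 2^32, so a value in the upper
 * half of the range means <gen> lies behind <cur>.
 */
static inline int gen_is_older(uint32_t gen, uint32_t cur)
{
    return (uint32_t)(gen - cur) >= 0x80000000u;
}
```

With this test, generation `0xFFFFFFFF` is still reported as older than generation `2` after the counter has wrapped, which a plain `gen < cur` would get wrong.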

10520 Commits