haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-13 10:37:02 +02:00

Author	SHA1	Message	Date
Willy Tarreau	38dba27d4d	BUG/MEDIUM: listener: only enable a listening listener if needed The test on listener->state == LI_LISTEN is not sufficient to decide if we need to enable a listener. Indeed, there is a very special case which is the inherited FD shared, which has to reflect the real socket state even after the previous test, and as such needs to remain in LI_LISTEN state. In this case we don't want a worker to start the master's listener nor conversely. Let's add a specific test for this.	2020-11-04 14:22:42 +01:00
Willy Tarreau	dfe79251da	BUG/MEDIUM: stick-table: limit the time spent purging old entries An interesting case was reported with threads and moderately sized stick-tables. Sometimes the watchdog would trigger during the purge. It turns out that the stick tables were sized in the 10s of K entries which is the order of magnitude of the possible number of connections, and that threads were used over distinct NUMA nodes. While at first glance nothing looks problematic there, actually there is a risk that a thread trying to purge the table faces 100% of entries still in use by a connection with (ts->ref_cnt > 0), and ends up scanning the whole table, while other threads on the other NUMA node are causing the cache lines to bounce back and forth and considerably slow down its progress to the point of possibly spending hundreds of milliseconds there, multiplied by the number of queued threads all failing on the same point. Interestingly, smaller tables would not trigger it because the scan would be faster, and larger ones would not trigger it because plenty of entries would be idle! The most efficient solution is to increase the table size to be large enough for this never to happen, but this is not reliable. We could have a parallel list of idle entries but that would significantly increase the storage and processing cost only to improve a few rare corner cases. This patch takes a more pragmatic approach, it considers that it will not visit more than twice the number of nodes to be deleted, which means that it accepts to fail up to 50% of the time. Given that very small batches are programmed each time (1/256 of the table size), this means the operation will finish quickly (128 times faster than now), and will reduce the inter-thread contention. If this needs to be reconsidered, it will probably mean that the batch size needs to be fixed differently. This needs to be backported to stable releases which extensively use threads, typically 2.0. 
Kudos to Nenad Merdanovic for figuring the root cause triggering this!	2020-11-03 18:02:42 +01:00
Amaury Denoyelle	e6ee820c07	MINOR: stats: do not display empty stat module title on html If a stat module is not available on the current proxy scope, do not display its title on the related html box. This is clearer for the user.	2020-11-03 17:04:22 +01:00
Amaury Denoyelle	e7b891f7d3	MINOR: mux_h2: add stat for total count of connections/streams Add counters for total number of http2 connections/stream since haproxy startup. Contrary to open_conn/stream, they are never reset to zero.	2020-11-03 17:04:22 +01:00
Amaury Denoyelle	2ac34d97a6	MINOR: mux_h2: capitalize frame type in stats http/2 frame type names are capitalized in the rfc, use the same notation on the stats labels.	2020-11-03 17:04:22 +01:00
Christopher Faulet	743bd6adc8	BUG/MINOR: filters: Skip disabled proxies during startup only This partially reverts the patch `400829cd2` ("BUG/MEDIUM: filters: Don't try to init filters for disabled proxies"). Disabled proxies must not be skipped in flt_deinit() and flt_deinit_all_per_thread() when HAProxy is stopped because, obviously, at this step, all proxies appear as disabled (or stopped, it is the same state). It is safe to do so because, during startup, filters declared on disabled proxies are removed. Thus they don't exist anymore during shutdown. This patch must be backported in all versions where the patch above is.	2020-11-03 16:51:48 +01:00
Ilya Shipitsin	04a5a440b8	BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions let us use HAVE_OPENSSL_KEYLOG for feature detection instead of versions	2020-11-03 14:54:15 +01:00
Christopher Faulet	5a7ca29061	BUG/MEDIUM: mux-pt: Release the tasklet during an HTTP upgrade When a TCP connection is upgraded to HTTP, the passthrough multiplexer owning the client connection is destroyed and replaced by an HTTP multiplexer. When it happens, the connection context is changed (it is in fact the mux itself). Thus, when the mux-pt is destroyed, the connection is not released. But only the connection must be kept. Everything else concerning the mux must be released, especially the tasklet used for I/O subscriptions. In this part, there was a bug and the tasklet was never released. This patch should fix the issue #935. It must be backported as far as 2.0.	2020-11-03 10:50:00 +01:00
Christopher Faulet	75bef00538	MINOR: server: Copy configuration file and line for server templates When servers based on server templates are initialized, the configuration file and line are now copied. This helps to emit understandable warning and alert messages. This patch may be backported if needed, as far as 1.8.	2020-11-03 10:44:38 +01:00
Christopher Faulet	ac1c60fd9c	BUG/MINOR: server: Set server without addr but with dns in RMAINT on startup On startup, if a server has no address but the dns resolutions are configured, "none" method is added to the default init-addr methods, in addition to "last" and "libc". Thus on startup, this server is set to RMAINT mode if no address is found. It is only performed if no other init-addr method is configured. Setting the RMAINT mode on startup is important to inhibit the health checks. For instance, the following servers will now be set to RMAINT mode on startup : server srv nofound.tld:80 check resolvers mydns server srv _http._tcp.service.local check resolvers mydns server-template srv 1-3 _http._tcp.service.local check resolvers mydns while the following ones will trigger an error : server srv nofound.tld:80 check server srv nofound.tld:80 check resolvers mydns init-addr libc server srv _http._tcp.service.local check server srv _http._tcp.service.local check resolvers mydns init-addr libc server-template srv 1-3 _http._tcp.service.local check resolvers mydns init-addr libc This patch must be backported as far as 1.8.	2020-11-03 10:44:26 +01:00
Christopher Faulet	5e29376efb	BUG/MINOR: checks: Report a socket error before any connection attempt When a health-check fails, if no connection attempt was performed, a socket error must be reported. But this was only done if the connection was not allocated. It must also be done if there is no control layer. Otherwise, a L7TOUT will be reported instead. A connection may have no control layer if its address family is invalid or not defined. This patch must be backported to 2.2.	2020-11-03 10:23:00 +01:00
Christopher Faulet	d5bd824b81	BUG/MINOR: proxy/server: Skip per-proxy/server post-check for disabled proxies per-proxy and per-server post-check callback functions must be skipped for disabled proxies because most of the configuration validity check is skipped for these proxies. This patch must be backported as far as 2.1.	2020-11-03 10:23:00 +01:00
Christopher Faulet	400829cd2c	BUG/MEDIUM: filters: Don't try to init filters for disabled proxies Configuration is parsed for such proxies but not validated. Concretely, it means check_config_validity() function does almost nothing for such proxies. Thus, we must be careful to not initialize filters for disabled proxies because the check callback function is not called. In fact, to be sure to avoid any trouble, filters for disabled proxies are released. This patch fixes a segfault at startup if the SPOE is configured for a disabled proxy. It must be backported as far as 1.7 (maybe with some adaptations).	2020-11-03 10:23:00 +01:00
Ilya Shipitsin	c9dfee43f3	BUILD: ssl: use SSL_CTRL_GET_RAW_CIPHERLIST instead of OpenSSL versions let us use SSL_CTRL_GET_RAW_CIPHERLIST for feature detection instead of versions [wla: SSL_CTRL_GET_RAW_CIPHERLIST was introduced by OpenSSL commit 94a209 along with SSL_CIPHER_find. It was removed in boringSSL.] Signed-off-by: William Lallemand <wlallemand@haproxy.org>	2020-11-03 09:24:43 +01:00
Willy Tarreau	a5bbaaf9f4	CLEANUP: pattern: fix spelling/grammatical/copy-paste in comments The code is horrible to work with because most functions are documented with misleading comments resulting from many spelling and grammatical mistakes, and plenty of remains of copy-paste mentioning arguments that do not exist and return values that are never set. Too many hours wasted writing non-working code because of assumptions resulting from this, let's fix this once for all now!	2020-10-31 13:14:10 +01:00
Willy Tarreau	8135d9bc0c	CLEANUP: pattern: use calloc() rather than malloc for structures It's particularly difficult to make sure that the various pattern structures are properly initialized given that they can be allocated at multiple places and systematically via malloc() instead of calloc(), thus not even leaving the possibility of default values. Let's adjust a few of them.	2020-10-31 13:14:10 +01:00
Willy Tarreau	6bedf151e1	MINOR: pattern: export pat_ref_push() Strangely this one was marked static inline within the file itself. Let's export it.	2020-10-31 13:13:48 +01:00
Willy Tarreau	6a1740767c	MINOR: pattern: make pat_ref_add() rely on pat_ref_append() Let's remove unneeded code duplication, both are exactly the same.	2020-10-31 13:13:48 +01:00
Willy Tarreau	f4edb72e0a	MINOR: pattern: make pat_ref_append() return the newly added element It's more convenient to return the element than to return just 0 or 1, as the next thing we'll want to do is to act on this element! In addition it was using variable arguments instead of consts, causing some reuse constraints which were also addressed. This doesn't change its use as a boolean, hence why call places were not modified.	2020-10-31 13:13:48 +01:00
Remi Tricot-Le Breton	8c2db71326	BUG/MINOR: cache: Inverted variables in http_calc_maxage function The maxage and smaxage variables were inadvertently assigned the Cache-Control s-maxage and max-age values respectively when it should have been the other way around. This can be backported on all branches after 1.8 (included).	2020-10-30 14:29:29 +01:00
Remi Tricot-Le Breton	40ed97b04b	BUG/MINOR: cache: Manage multiple values in cache-control header value If an HTTP request or response had a "Cache-Control" header that had multiple comma-separated subparts in its value (like "max-age=1, no-store" for instance), we did not process the values correctly and only parsed the first one. That made us store some HTTP responses in the cache when they were explicitly uncacheable. This patch replaces the way the values are parsed by an http_find_header loop that manages every sub part of the value independently. This patch should be backported to 2.2 and 2.1. The bug also exists on previous versions but since the sources changed, a new commit will have to be created. [wla: This patch requires `bb4582c` ("MINOR: ist: Add a case insensitive istmatch function"). Backporting for < 2.1 is not a requirement since it works well enough for most cases, it was a known limitation of the implementation of non-htx version too]	2020-10-30 13:28:34 +01:00
Remi Tricot-Le Breton	a6476114ec	MINOR: cache: Add Expires header value parsing When no Cache-Control max-age or s-maxage information is present in a cached response, we need to parse the Expires header value (RFC 7234#5.3). An invalid Expires date value or a date earlier than the reception date will make the cache_entry stale upon creation. For now, the Cache-Control and Expires headers are parsed after the insertion of the response in the cache so even if the parsing of the Expires results in an already stale entry, the entry will exist in the cache.	2020-10-30 11:08:38 +01:00
Amaury Denoyelle	bc0af6a199	BUG/MINOR: lua: initialize sample before using it Memset the sample before using it through hlua_lua2smp. This function is ORing the smp.flags, so this field needs to be cleared before its use. This was reported by a coverity warning. Fixes the github issue #929. This bug can be backported up to 1.8.	2020-10-29 18:52:44 +01:00
Amaury Denoyelle	e6ba7915eb	BUG/MINOR: server: fix down_time report for stats Adjust condition used to report down_time for statistics. There was a tiny probability of reporting a negative downtime if last_change was greater than now. If this is the case, return only down_time. This bug can be backported up to 1.8.	2020-10-29 18:52:39 +01:00
Amaury Denoyelle	fe2bf091f6	BUG/MINOR: server: fix srv downtime calcul on starting When a server is up after a failure, its downtime was reset to 0 on the statistics. This is due to a wrong condition that causes srv.down_time to never be set. Fix this by updating down_time each time the server is in STARTING state. Fixes the github issue #920. This bug can be backported up to 1.8.	2020-10-29 18:52:18 +01:00
Amaury Denoyelle	66942c1d4d	MINOR: mux-h2: count open connections/streams on stats Implement as a gauge h2 counters for currently open connections and streams. The counters are decremented when closing the stream or the connection.	2020-10-28 08:55:23 +01:00
Amaury Denoyelle	a8879238ce	MINOR: mux-h2: report detected error on stats Implement counters for h2 protocol error on connection or stream level. Also count the total number of rst_stream and goaway frames sent by the mux in response to a detected error.	2020-10-28 08:55:19 +01:00
Amaury Denoyelle	2dec1ebec2	MINOR: mux-h2: add stats for received frame types Implement counters for h2 frame received based on their type for HEADERS, DATA, SETTINGS, RST_STREAM and GOAWAY.	2020-10-28 08:55:16 +01:00
Amaury Denoyelle	c92697d977	MINOR: mux-h2: add counters instance to h2c Add a pointer to the counters as a member of the h2c structure. This pointer is initialized in the h2_init function. This is useful to quickly access and manipulate the counters inside every h2 function.	2020-10-28 08:55:11 +01:00
Amaury Denoyelle	3238b3f906	MINOR: mux-h2: register a stats module Use statistics API to register a new stats module generating counters on h2 module. The counters are attached to frontend/backend instances.	2020-10-28 08:55:07 +01:00
Remi Tricot-Le Breton	bf97121f1c	MINOR: cache: Create res.cache_hit and res.cache_name sample fetches Res.cache_hit sample fetch returns a boolean which is true when the HTTP response was built out of a cache. The cache's name is returned by the res.cache_name sample_fetch. This resolves GitHub issue #900.	2020-10-27 18:25:43 +01:00
Remi Tricot-Le Breton	53161d81b8	MINOR: cache: Process the If-Modified-Since header in conditional requests If a client sends a conditional request containing an If-Modified-Since header (and no If-None-Match header), we try to compare the date with the one stored in the cache entry (coming either from a Last-Modified header, or a Date header, or corresponding to the first response's reception time). If the request's date is later than or equal to the stored one (that is, the entry was not modified since that date), we send a "304 Not Modified" response back. Otherwise, the stored response is sent (through a 200 OK response). This resolves GitHub issue #821.	2020-10-27 18:10:25 +01:00
Remi Tricot Le Breton	27091b4dd0	MINOR: cache: Store the "Last-Modified" date in the cache_entry In order to manage "If-Modified-Since" requests, we need to keep a reference time for our cache entries (to which the conditional request's date will be compared). This reference is either extracted from the "Last-Modified" header, or the "Date" header, or the reception time of the response (in decreasing order of priority). The date values are converted into seconds since epoch in order to ease comparisons and to limit storage space.	2020-10-27 18:10:25 +01:00
Tim Duesterhus	e0142340b2	BUG/MINOR: cache: Check the return value of http_replace_res_status Send the full body if the status `304` cannot be applied. This should be the most graceful failure. Specific for 2.3, no backport needed.	2020-10-27 17:01:49 +01:00
Ilya Shipitsin	b9b84a4b25	BUILD: ssl: more elegant OpenSSL early data support check BoringSSL pretends to be the 1.1.1 version of OpenSSL. This breaks some version-based feature presence checks, for example the OpenSSL-specific early data support. Let us change that feature detection to an SSL_READ_EARLY_DATA_SUCCESS macro check instead of a version comparison.	2020-10-27 13:08:32 +01:00
Willy Tarreau	a0133fcf35	BUG/MINOR: log: fix risk of null deref on error path Previous commit `ae32ac74db` ("BUG/MINOR: log: fix memory leak on logsrv parse error") addressed one issue and introduced another one, the logsrv pointer may also be null at the end of the function so we must test it before deciding to dereference it. This should be backported along with the patch above to 2.2.	2020-10-27 10:35:32 +01:00
Willy Tarreau	ae32ac74db	BUG/MINOR: log: fix memory leak on logsrv parse error In case of parsing error on logsrv, we can leave parse_logsrv() without releasing logsrv->ring_name or smp_rgs. Let's free them on the error path. This should fix issue #926 detected by Coverity. The impact is only a tiny leak just before reporting a fatal error, so it will essentially annoy valgrind. This can be backported to 2.0 (just drop the ring part).	2020-10-27 09:55:00 +01:00
Emmanuel Hocdet	a73a222a98	BUG/MEDIUM: ssl: OCSP must work with BoringSSL It's a regression from `b3201a3e` "BUG/MINOR: disable dynamic OCSP load with BoringSSL". The original bug is linked to `76b4a12` "BUG/MEDIUM: ssl: memory leak of ocsp data at SSL_CTX_free()": ssl_sock_free_ocsp() should be in #ifndef OPENSSL_IS_BORINGSSL. To avoid long #ifdef for small code, the BoringSSL part for ocsp load is isolated in a simple #ifdef. This must be backported in 2.2 and 2.1	2020-10-27 09:38:51 +01:00
William Dauchy	5e10e44bce	CLEANUP: http_ana: remove unused assignation of `att_beg` `att_beg` is assigned to `next` at the end of the `for` loop, but is assigned to `prev` at the beginning of the loop, which is itself assigned to `next` after each loop. So it represents a double assignation for the same value. Also `att_beg` is not used after the end of the loop. this is a partial fix for github issue #923, all the others could probably be marked as intentional to protect future changes. no backport needed. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-10-26 15:00:09 +01:00
Willy Tarreau	b3250a268b	BUG/MINOR: extcheck: add missing checks on extchk_setenv() Issue #910 reports that we fail to check a few extchk_setenv() in the child process. These are mostly harmless, but instead of counting on the external check script to fail the dirty way, better fail cleanly when detecting the failure. This could probably be backported to all stable branches.	2020-10-24 13:07:39 +02:00
Willy Tarreau	5472aa50f1	BUG/MEDIUM: queue: fix unsafe proxy pointer when counting nbpend As reported by Coverity in issue #917, commit `96bca33` ("OPTIM: queue: decrement the nbpend and totpend counters outside of the lock") introduced a bug when moving the increments outside of the loop, because we can't always rely on the pendconn "p" here as it may be null. We can retrieve the proxy pointer directly from s->proxy instead. The same is true for pendconn_redistribute(), though the last "p" pointer there was still valid. This patch fixes both. No backport is needed, this was introduced just before 2.3-dev8.	2020-10-24 12:57:41 +02:00
Willy Tarreau	bd71510024	MINOR: stats: report server's user-configured weight next to effective weight The "weight" column on the stats page is somewhat confusing when using slowstart because it reports the effective weight, without being really explicit about it. In some situations the user-configured weight is more relevant (especially with long slowstarts where it's important to know if the configured weight is correct). This adds a new uweight stat which reports a server's user-configured weight, and in a backend it receives the sum of all servers' uweights. In addition it adds the mention of "effective" in a few descriptions for the "weight" column (help and doc). As a result, the list of servers in a backend is now always scanned when dumping the stats. But this is not a problem given that these servers are already scanned anyway and for way heavier processing.	2020-10-23 22:47:30 +02:00
William Lallemand	089c13850f	MEDIUM: ssl: ssl-load-extra-del-ext work only with .crt In order to be compatible with the "set ssl cert" command of the CLI, this patch restricts the ssl-load-extra-del-ext to files with a ".crt" extension in the configuration. Related to issue #785. Should be backported where `8e8581e` ("MINOR: ssl: 'ssl-load-extra-del-ext' removes the certificate extension") was backported.	2020-10-23 18:41:08 +02:00
Willy Tarreau	2fbe6940f4	MINOR: stats: indicate the number of servers in a backend's status When dumping the stats page (or the CSV output), when many states are mixed, it's hard to figure the number of up servers. But when showing only the "up" servers or hiding the "maint" servers, there's no way to know how many servers are configured, which is problematic when trying to update server-templates. What this patch does, for dumps in "up" or "no-maint" modes, is to add after the backend's "UP" or "DOWN" state "(%d/%d)" indicating the number of servers seen as UP to the total number of servers in the backend. As such, seeing "UP (33/39)" immediately tells that there are 6 servers that are not listed when using "up", or will let the client figure how many servers are left once deducted the number of non-maintenance ones. It's not done on default dumps so as not to disturb existing tools, which already have all the information they need in the dump.	2020-10-23 18:11:30 +02:00
Willy Tarreau	3e32036701	MINOR: stats: also support a "no-maint" show stat modifier "no-maint" is a bit similar to "up" except that it will only hide servers that are in maintenance (or disabled in the configuration), and not those that are enabled but failed a check. One benefit here is to significantly reduce the output of the "show stat" command when using large server-templates containing entries that are not yet provisioned. Note that the prometheus exporter also has such an option which does the exact same.	2020-10-23 18:11:24 +02:00
Willy Tarreau	65141ffc4f	MINOR: stats: support the "up" output modifier for "show stat" We already had it on the HTTP interface but it was not accessible on the CLI. It can be very convenient to hide servers which are down, do not resolve, or are in maintenance.	2020-10-23 18:11:24 +02:00
Willy Tarreau	8ae8c48eb0	MEDIUM: fwlc: re-enable per-server queuing up to maxqueue Leastconn has the nice property of being able to sort servers by their current usage. It's really a shame to force all requests into the backend queue when the algo would be able to also consider their current queue. In order not to change existing behavior but extend it, this patch allows leastconn to elect servers which are already full if they have an explicitly configured maxqueue setting above zero and their queue hasn't reached that threshold. This will significantly reduce the pressure in the backend queue when queuing a lot with lots of servers. A test on 8 threads with 100 servers configured with maxconn 1 jumped from 165krps to 330krps with maxqueue 15 with this patch. This partially undoes commit `82cd5c13a` ("OPTIM: backend: skip LB when we know the backend is full") but allows to scale much better even by setting a single-digit maxqueue value. Some better heuristics could be used to maintain the behavior of the bypass in the patch above, consisting in keeping it if it's known that there is no server with a configured maxqueue in the farm (or in the backend).	2020-10-22 18:30:25 +02:00
Willy Tarreau	8c855f6cff	MINOR: leastconn: take the queue length into account when queuing servers When servers are queued into the leastconn tree, it's important to also consider their queue length. There could be some servers with lots of queued requests that we don't want to hammer with extra connections. In order not to add extra stress to the LB algorithm, we don't update the value when adding to the queue, only when updating the connection count (i.e. picking from the queue or releasing a connection). This will be sufficient to significantly improve the fairness in such situations.	2020-10-22 18:30:18 +02:00
Willy Tarreau	96bca33d75	OPTIM: queue: decrement the nbpend and totpend counters outside of the lock We don't need to do that inside the lock. However since the operation used to be done in deep functions, we have to make it resurface closer to visible parts. It remains reasonably self-contained in queue.c so that's not that big of a deal. Some places (redistribute) could benefit from a single operation for all counts at once. Others like pendconn_process_next_strm() are still called with both locks held but now it will be possible to change this.	2020-10-22 17:32:28 +02:00
Willy Tarreau	56c1cfb179	OPTIM: queue: make the nbpend counters atomic Instead of incrementing, decrementing them and updating their max under the lock, make them atomic and keep them out of the lock as much as possible. For __pendconn_unlink_* it would be wise to decide to move these counters outside of the function, inside the callers so that a single atomic op can be done per counter even for groups of operations.	2020-10-22 17:32:28 +02:00
Willy Tarreau	c7eedf7a5a	MINOR: queue: reduce the locked area in pendconn_add() Similarly to previous changes, we know if we're dealing with a server or proxy lock so let's directly lock at the finest possible places there. It's worth noting that a part of the operation consisting in an increment and update of a max could be done outside of the lock using atomic ops and a CAS.	2020-10-22 17:32:28 +02:00
Willy Tarreau	3e3ae2524d	MINOR: queue: split __pendconn_unlink() in per-srv and per-prx The function is called with the lock held and does too many tests for things that are already known from its callers. Let's split it in two so that its callers call either the per-server or per-proxy function depending on where the element is (since they had to determine it prior to taking the lock).	2020-10-22 17:32:28 +02:00
Willy Tarreau	5503908bdc	MINOR: proxy/cli: only take a read lock in "show errors" There's no point having an exclusive lock here, nothing is modified.	2020-10-22 17:32:28 +02:00
Willy Tarreau	595e767030	MINOR: server: read-lock the cookie during srv_set_dyncookie() No need to use an exclusive lock on the proxy anymore when reading its setting, a read lock is enough. A few other places continue to use a write-lock when modifying simple flags only in order to let this function see a consistent value all along. This might be changed in the future using barriers and local copies.	2020-10-22 17:32:28 +02:00
Willy Tarreau	ac66d6bafb	MINOR: proxy; replace the spinlock with an rwlock This is an anticipation of finer grained locking for the queues. For now all lock places take a write lock so that there is no difference at all with previous code.	2020-10-22 17:32:28 +02:00
Christopher Faulet	9a3d3fcb5d	BUG/MAJOR: mux-h2: Don't try to send data if we know it is no longer possible In h2_send(), if we are in a state where we know it is no longer possible to send data, we must exit the sending loop to avoid any possibility to loop forever. It may happen if the mbuf ring is released while the H2_CF_MUX_MFULL flag is still set. Here is a possible scenario to trigger the bug : 1) The mbuf ring is full because we are unable to send data. The H2_CF_MUX_MFULL flag is set on the H2 connection. 2) At this stage, the task timeout expires because the H2 connection is blocked. We enter in h2_timeout_task() function. Because the mbuf ring is full, we cannot send the GOAWAY frame. Thus the H2_CF_GOAWAY_FAILED flag is set. The H2 connection is not released yet because there is still a stream attached. Here we leave h2_timeout_task() function. 3) A bit later, the H2 connection is woken up. In h2_process(), nothing is performed by the first attempt to send data, in h2_send(). Then, because the H2_CF_GOAWAY_FAILED flag is set, the mbuf ring is released. But the H2_CF_MUX_MFULL flag is still there. At this step a second attempt to send data is performed. 4) In h2_send(), we try to send data in a loop. To exit this loop, done variable must be set to 1. Because the H2_CF_MUX_MFULL flag is set, we don't call h2_process_mux() and done is not updated. Because the mbuf ring is now empty, nothing is sent and the H2_CF_MUX_MFULL flag is never removed. Now, we loop forever... waiting for the watchdog. To fix the bug, we now exit the loop if one of these conditions is true : - The H2_CF_GOAWAY_FAILED flag is set on the H2 connection - The CO_FL_SOCK_WR_SH flag is set on the underlying connection - The H2 connection is in the H2_CS_ERROR2 state This patch should fix the issue #912 and most probably #875. It must be backported as far as the 1.8.	2020-10-22 17:13:22 +02:00
Christopher Faulet	d6c48366b8	BUG/MINOR: http-ana: Don't send payload for internal responses to HEAD requests When an internal response is returned to a client, the message payload must be skipped if it is a reply to a HEAD request. The payload is removed from the HTX message just before the message forwarding. This bug has been around for a long time. It was already there in the pre-HTX versions. In legacy HTTP mode, internal errors are not parsed. So this bug cannot be easily fixed. Thus, this patch should only be backported in all HTX versions, as far as 2.0. However, the code has significantly changed in the 2.2. Thus in the 2.1 and 2.0, the patch must be entirely reworked.	2020-10-22 17:13:22 +02:00
Tim Duesterhus	6414cd1fc0	CLEANUP: compression: Make use of http_get_etag_type() This commit makes the compressor use http_get_etag_type to validate the ETag instead of using an ad-hoc condition.	2020-10-22 16:59:36 +02:00
Remi Tricot-Le Breton	6cb10384a3	MEDIUM: cache: Add support for 'If-None-Match' request header Partial support of conditional HTTP requests. This commit adds the support of the 'If-None-Match' header (see RFC 7232#3.2). When a client specifies a list of ETags through one or more 'If-None-Match' headers, they are all compared to the one that might have been stored in the corresponding http cache entry until one of them matches. If a match happens, a specific "304 Not Modified" response is sent instead of the cached data. This response has all the stored headers but no other data (see RFC 7232#4.1). Otherwise, the whole cached data is sent. Although unlikely in a GET/HEAD request, the "If-None-Match: *" syntax is valid and also receives a "304 Not Modified" response (RFC 7234#4.3.2). This resolves a part of GitHub issue #821.	2020-10-22 16:10:20 +02:00
Remi Tricot-Le Breton	dbb65b5a7a	MEDIUM: cache: Store the ETag information in the cache_entry When sent by a server for a given resource, the ETag header is stored in the corresponding cache entry (as any other header). So in order to perform future ETag comparisons (for subsequent conditional HTTP requests), we keep the length of the ETag and its offset relative to the start of the cache_entry. If no ETag header exists, the length and offset are zero.	2020-10-22 16:10:20 +02:00
Remi Tricot-Le Breton	bcced09b91	MINOR: http: Add etag comparison function Add a function that compares two etags that might be of different types. If any of them is weak, the 'W/' prefix is discarded and a strict string comparison is performed. Co-authored-by: Tim Duesterhus <tim@bastelstu.be>	2020-10-22 16:06:20 +02:00
Willy Tarreau	1e690bb6c4	BUG/MEDIUM: server: support changing the slowstart value from state-file If the slowstart value in a state file implies the latest state change is within the slowstart period, we end up calling srv_update_status() to reschedule the server's state change but its task is not yet allocated and remains null, causing a crash on startup. Make sure srv_update_status() supports being called with partially initialized servers which do not yet have a task. If the task has to be scheduled, it will necessarily happen after initialization since it will result from a state change. This should be backported wherever server-state is present.	2020-10-22 12:07:07 +02:00
Willy Tarreau	ef71f0194c	BUG/MINOR: queue: properly report redistributed connections In commit `5cd4bbd7a` ("BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management") the counter of transferred connections was accidentally lost, so that when a server goes down with connections in its queue, it will always be reported that 0 connections were transferred. This should be backported as far as 1.8 since the patch above was backported there.	2020-10-21 12:04:53 +02:00
William Lallemand	8e8581e242	MINOR: ssl: 'ssl-load-extra-del-ext' removes the certificate extension In issue #785, users are reporting that it's not convenient to load a ".crt.key" when the configuration contains a ".crt". This option allows to remove the extension of the certificate before trying to load any extra SSL file (.key, .ocsp, .sctl, .issuer etc.) The patch changes a little bit the way ssl_sock_load_files_into_ckch() looks for the file.	2020-10-20 18:25:46 +02:00
William Dauchy	835712ad90	BUG/MINOR: listener: close before free in `listener_accept` safer to close handle before the object is put back in the global pool. this was introduced by commit `9378bbe0be` ("MEDIUM: listener: use protocol->accept_conn() to accept a connection") this should fix github issue #902 no backport needed. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-10-20 15:40:36 +02:00
Willy Tarreau	f42d794d96	MEDIUM: config: report that "nbproc" is deprecated As previously discussed, nbproc usage is bad, deprecated, and scheduled for removal in 2.5. If "nbproc" is found with more than one process while nbthread is not set, a warning will be emitted encouraging to remove it or to migrate to nbthread instead. This makes sure the user has an opportunity to both see the message and silence it.	2020-10-20 11:54:49 +02:00
Willy Tarreau	69a7b8fc6c	CLEANUP: task: remove the unused and mishandled global_rqueue_size This counter is only updated and never used, and in addition it's done without any atomicity so it's very unlikely to be correct on multi-CPU systems! Let's just remove it since it's not used.	2020-10-19 14:08:13 +02:00
Willy Tarreau	3d18498645	CLEANUP: threads: don't register an initcall when not debugging It's a bit overkill to register an initcall to call a function to set a lock to zero when not debugging, let's just declare the lock as pre-initialized to zero.	2020-10-19 14:08:13 +02:00
Ilya Shipitsin	b3201a3e07	BUG/MINOR: disable dynamic OCSP load with BoringSSL it was accidentally enabled on BoringSSL while actually it is not supported wla: Fix part of the issue mentioned in #895. It fixes build of boringSSL versions prior to commit https://boringssl.googlesource.com/boringssl/+/49e9f67d8b7cbeb3953b5548ad1009d15947a523 Must be backported in 2.2. Signed-off-by: William Lallemand <wlallemand@haproxy.org>	2020-10-19 11:00:51 +02:00
Willy Tarreau	4b6e3c284a	MINOR: lb/chash: use a read lock in chash_get_server_hash() When using a low hash-balance-factor value, it's possible to loop many times trying to find the best server. Figures in the order of 100-300 times were observed for 1000 servers with a factor of 101 (which seems a bit excessive for such a large farm). Given that there's nothing in that function that prevents multiple threads from working in parallel, let's switch to a read lock. Tests on 8 threads show roughly a 2% performance increase with this.	2020-10-17 20:15:49 +02:00
Willy Tarreau	f76a21f78c	MINOR: lb/first: use a read lock in fas_get_next_server() The "first" algorithm creates a lot of contention because all threads focus on the same server by definition (the first available one). By turning the exclusive lock to a read lock in fas_get_next_server(), the request rate increases by 16% for 8 threads when many servers are getting close to their maxconn.	2020-10-17 19:49:49 +02:00
Willy Tarreau	58bc9c1ced	MINOR: lb/leastconn: only take a read lock in fwlc_get_next_server() This function doesn't change the tree, it only looks for the first usable server, so let's do that under a read lock to limit the situations like the ones described in issue #881 where finding a usable server when dealing with lots of saturated ones can be expensive. At least threads will now be able to look up in parallel. It's interesting to note that s->served is not incremented during the server choice, nor is the server repositioned. So right now already, nothing prevents multiple threads from picking the same server. This will not cause a significant imbalance anyway given that the server will automatically be repositioned at the right place, but this might be something to improve in the future if it doesn't come with too high a cost. It also looks like the way a server's weight is updated could be revisited so that the write lock gets tighter at the expense of a short part of inconsistency between weights and servers still present in the tree.	2020-10-17 19:37:40 +02:00
Willy Tarreau	ae99aeb135	MINOR: lb/map: use seek lock and read locks where appropriate - map_get_server_hash() doesn't need a write lock since it only reads the array, let's only use a read lock here. - map_get_server_rr() only needs exclusivity to adjust the rr_idx while looking for its entry. Since this one is not used by map_get_server_hash(), let's turn this lock to a seek lock that doesn't block reads. With 8 threads, no significant performance difference was noticed given that lookups are usually instant with this LB algo so the lock contention is rare.	2020-10-17 19:04:27 +02:00
Willy Tarreau	cd10def825	MINOR: backend: replace the lbprm lock with an rwlock It was previously a spinlock, and it happens that a number of LB algos only lock it for lookups, without performing any modification. Let's first turn it to an rwlock and w-lock it everywhere. This is strictly identical. It was carefully checked that every HA_SPIN_LOCK() was turned to HA_RWLOCK_WRLOCK() and that HA_SPIN_UNLOCK() was turned to HA_RWLOCK_WRUNLOCK() on this lock. _INIT and _DESTROY were updated too.	2020-10-17 18:51:41 +02:00
Christopher Faulet	26a52af642	BUG/MEDIUM: lb: Always lock the server when calling server_{take,drop}_conn The server lock must be held when server_take_conn() and server_drop_conn() lbprm callback functions are called. It is a documented prerequisite but it is not always performed. It only affects the leastconn and fas lb algorithms. Others don't use these callback functions. A race condition on the next pending effective weight (next_eweight) may be encountered with the leastconn lb algorithm. An agent check may set it to 0 while fwlc_srv_reposition() is called. The server is locked during the next_eweight update. But because the server lock is not acquired when fwlc_srv_reposition() is called, we may use it to recompute the server key, leading to a division by 0. This patch must be backported as far as 1.8.	2020-10-17 09:29:43 +02:00
Christopher Faulet	db2c17da60	BUG/MEDIUM: mux-h1: Get the session from the H1S when capturing bad messages It is not guaranteed that the backend connection has an owner. It is set when the connection is created. But when the connection is moved in a server idle list, the connection owner is set to NULL and may never be set again. On the other hand, when a mux is created or when a CS is attached, the session is always defined. The H1 stream always keeps a reference on it when it is created. Thus, when a bad message is captured we should not rely on the connection owner to retrieve the session. Instead we should get it from the H1 stream.	2020-10-16 19:53:17 +02:00
Christopher Faulet	2469eba20f	BUG/MEDIUM: spoe: Unset variable instead of set it if no data provided If an agent tries to set a variable with the NULL data type, an unset is performed instead to avoid undefined behaviors. Once decoded, such data are translated to a sample with the type SMP_T_ANY. It is unexpected in HAProxy. When a variable is set with such a sample, no data are attached to the variable. Thus, when the variable is retrieved later in the transaction, the sample data are uninitialized, leading to undefined behaviors depending on how it is used. For instance, it leads to a crash if the debug converter is used on such a variable. This patch should fix the issue #855. It must be backported as far as 1.8.	2020-10-16 19:53:17 +02:00
Amaury Denoyelle	7239c24986	MEDIUM: backend: reuse connection if using a static sni Detect if the sni used a constant value and if so, allow to reuse this connection for later sessions. Use a combination of SMP_USE_INTRN + !SMP_F_VOLATILE to consider a sample as a constant value. This feature has been requested on github issue #371.	2020-10-16 17:48:01 +02:00
Amaury Denoyelle	2f0a797631	MINOR: ssl: add volatile flags to ssl samples The ssl samples are not constant over time and change according to the session. Add the flag SMP_F_VOL_SESS to indicate this.	2020-10-16 17:47:29 +02:00
Frédéric Lécaille	baeb919177	BUG/MINOR: peers: Possible unexpected peer session reset after collisions. During a peers session collision (two peer sessions opened on both sides) we must mark as alive the peer whose session will be shut down; otherwise the ->reconnect timer will be set with a wrong value if the synchro task expires after the peer has been reconnected. This possibly leads to unexpected disconnections during handshakes. Furthermore, this patch cancels any heartbeat transmission when a reconnection is prepared.	2020-10-16 17:45:58 +02:00
Willy Tarreau	0aa5a5b175	BUILD: listener: avoid a build warning when threads are disabled It's just a __decl_thread() that appeared before the last variable.	2020-10-16 17:43:04 +02:00
Willy Tarreau	d48ed6643b	MEDIUM: task: use an upgradable seek lock when scanning the wait queue Right now when running a configuration with many global timers (e.g. many health checks), there is a lot of contention on the global wait queue lock because all threads queue up in front of it to scan it. With 2000 servers checked every 10 milliseconds (200k checks per second), after 23 seconds running on 8 threads, the lock stats were this high: Stats about Lock TASK_WQ: write lock : 9872564 write unlock: 9872564 (0) wait time for write : 9208.409 msec wait time for write/lock: 932.727 nsec read lock : 240367 read unlock : 240367 (0) wait time for read : 149.025 msec wait time for read/lock : 619.991 nsec i.e. ~5% of the total runtime spent waiting on this specific lock. With upgradable locks we don't need to work like this anymore. We can just try to upgrade the read lock to a seek lock before scanning the queue, then upgrade the seek lock to a write lock for each element we want to delete there and immediately downgrade it to a seek lock. The benefit is double: - all other threads which need to call next_expired_task() before polling won't wait anymore since the seek lock is compatible with the read lock ; - all other threads competing on trying to grab this lock will fail on the upgrade attempt from read to seek, and will let the current lock owner finish collecting expired entries. Doing only this has reduced the wake_expired_tasks() CPU usage in a very large servers test from 2.15% to 1.04% as reported by perf top, and increased by 3% the health check rate (all threads being saturated). This is expected to help against (and possibly solve) the problem described in issue #875.	2020-10-16 17:15:54 +02:00
Willy Tarreau	3cfaa8d1e0	BUG/MEDIUM: task: bound the number of tasks picked from the wait queue at once There is a theoretical problem in the wait queue, which is that with many threads, one could spend a lot of time looping on the newly expired tasks, causing a lot of contention on the global wq_lock and on the global rq_lock. This initially sounds benign, but if another thread does just a task_schedule() or task_queue(), it might end up waiting for a long time on this lock, and this wait time will count on its execution budget, degrading the end user's experience and possibly risking to trigger the watchdog if that lasts too long. The simplest (and backportable) solution here consists in bounding the number of expired tasks that may be picked from the global wait queue at once by a thread, given that all other ones will do it as well anyway. We don't need to pick more than global.tune.runqueue_depth tasks at once as we won't process more, so this counter is updated for both the local and the global queues: threads with more local expired tasks will pick less global tasks and conversely, keeping the load balanced between all threads. This will guarantee a much lower latency if/when wakeup storms happen (e.g. hundreds of thousands of synchronized health checks). Note that some crashes have been witnessed with 1/4 of the threads in wake_expired_tasks() and, while the issue might or might not be related, not having reasonable bounds here definitely justifies why we can spend so much time there. This patch should be backported, probably as far as 2.0 (maybe with some adaptations).	2020-10-16 15:18:48 +02:00
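The shared budget described in the commit above can be illustrated with a small C sketch. It only models the counting: one budget (standing in for `global.tune.runqueue_depth`) is consumed first by the local expired tasks, and whatever remains bounds what is taken from the global queue. The type `sketch_picked` and the functions `sketch_min()` and `sketch_pick_expired()` are hypothetical, and the local-then-global order is an assumption of this sketch rather than something the commit message spells out.

```c
/* Number of expired tasks taken from each wait queue (illustrative type). */
struct sketch_picked {
    unsigned int local;
    unsigned int global;
};

static unsigned int sketch_min(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Hypothetical model: <budget> is shared by both queues. Local expired
 * tasks consume it first, the remainder caps the global ones, so a thread
 * with many local expired tasks picks fewer global tasks.
 */
struct sketch_picked sketch_pick_expired(unsigned int local_expired,
                                         unsigned int global_expired,
                                         unsigned int budget)
{
    struct sketch_picked p;

    p.local = sketch_min(local_expired, budget);
    budget -= p.local;
    p.global = sketch_min(global_expired, budget);
    return p;
}
```

For instance, with a budget of 200, a thread holding 50 local expired tasks would take at most 150 from a global queue of 1000, while a thread holding 300 local expired tasks would take none from the global queue in that pass.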
Willy Tarreau	ba29687bc1	BUG/MEDIUM: proxy: properly stop backends The proxy stopping mechanism was changed with commit `322b9b94e` ("MEDIUM: proxy: make stop_proxy() now use stop_listener()") so that it's now entirely driven by the listeners. One thing was forgotten though, which is that pure backends will not stop anymore since they don't have any listener, and that it's necessary to stop them in order to stop the health checks. No backport is needed.	2020-10-16 15:16:17 +02:00
Willy Tarreau	233ad288cd	CLEANUP: protocol: remove the now unused <handler> field of proto_fam->bind() We don't need to specify the handler anymore since it's set in the receiver. Let's remove this argument from the function and clean up the remains of code that were still setting it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	a74cb38e7c	MINOR: protocol: register the receiver's I/O handler and not the protocol's Now we define a new sock_accept_iocb() for socket-based stream protocols and use it as a wrapper for listener_accept() which now takes a listener and not an FD anymore. This will allow the receiver's I/O cb to be redefined during registration, and more specifically to get rid of the hard-coded hacks in protocol_bind_all() made for syslog. The previous ->accept() callback in the protocol was removed since it doesn't have anything to do with accept() anymore but is more generic. A few places where listener_accept() was compared against the FD's IO callback for debugging purposes on the CLI were updated.	2020-10-15 21:47:56 +02:00
Willy Tarreau	e140a6921f	MINOR: log: set the UDP receiver's I/O handler in the receiver The I/O handler is syslog_fd_handler(), let's set it when creating the receivers.	2020-10-15 21:47:56 +02:00
Willy Tarreau	d2fb99f9d5	MINOR: protocol: add a default I/O callback and put it into the receiver For now we're still using the protocol's default accept() function as the I/O callback registered by the receiver into the poller. While this is usable for most TCP connections where a listener is needed, this is not suitable for UDP where a different handler is needed. Let's make this configurable in the receiver just like the upper layer is configurable for listeners. In order to ease stream protocols handling, the protocols will now provide a default I/O callback which will be preset into the receivers upon allocation so that almost none of them has to deal with it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	caa91de718	MEDIUM: listener: remove the second pass of fd manipulation at the end The receiver FDs must not be manipulated by the listener_accept() function anymore, it must exclusively rely on the job performed by its listeners, as it is also the only way to keep the receivers working for established connections regardless of the listener's state (typically for multiplexed protocols like QUIC). This used to be necessary when the FDs were adjusted at once only but now that fd_done() is gone and the need for polling enabled by the accept_conn() function which detects the EAGAIN, we have nothing to do there to fixup any possible previous bad decision anymore. Interestingly, as a side effect of making the code not depend on the FD anymore, it also removes the need for a second lock, which increases the accept rate by about 1% on 8 threads.	2020-10-15 21:47:56 +02:00
Willy Tarreau	9378bbe0be	MEDIUM: listener: use protocol->accept_conn() to accept a connection Now listener_accept() doesn't have to deal with the incoming FD anymore (except for a little bit of side band stuff). It directly retrieves a valid connection from the protocol layer, or receives a well-defined error code that helps it decide how to proceed. This removes a lot of hardly maintainable low-level code and opens the function to receive new protocol stacks.	2020-10-15 21:47:56 +02:00
Willy Tarreau	344b8fcf87	MINOR: sockpair: implement sockpair_accept_conn() to accept a connection This is the same as previous commit, but this time for the sockpair- specific stuff, relying on recv_fd_uxst() instead of accept(), so the code is simpler. The various errno cases are handled like for regular sockets, though some of them will probably never happen, but this does not hurt.	2020-10-15 21:47:56 +02:00
Willy Tarreau	f1dc9f2f17	MINOR: sock: implement sock_accept_conn() to accept a connection The socket-specific accept() code in listener_accept() has nothing to do there. Let's move it to sock.c where it can be significantly cleaned up. It will now directly return an accepted connection and provide a status code instead of letting listener_accept() deal with various errno values. Note that this doesn't support the sockpair specific code. The function is now responsible for dealing with its own receiver's polling state and calling fd_cant_recv() when facing EAGAIN. One tiny change from the previous implementation is that the connection's sockaddr is now allocated before trying accept(), which saves a memcpy() of the resulting address for each accept at the expense of a cheap pool_alloc/pool_free on the final accept returning EAGAIN. This still apparently slightly improves accept performance in microbenchmarks.	2020-10-15 21:47:56 +02:00
Willy Tarreau	7d053e4211	MINOR: sock: rename sock_accept_conn() to sock_accepting_conn() This call was introduced by commit `5ced3e887` ("MINOR: sock: add sock_accept_conn() to test a listening socket") but is actually quite confusing because it makes one think the socket will accept a connection (which is what we want to have in a new function) while it only tells whether it's configured to accept connections. Let's call it sock_accepting_conn() instead. The same change was applied to sockpair which had the same issue.	2020-10-15 21:47:56 +02:00
Willy Tarreau	01ca149047	MINOR: session: simplify error path in session_accept_fd() Now that this function is always called with an initialized connection and that the control layer is always initialized, we don't need to play games with fdtab[] to decide how to close, we can simply rely on the regular close path using conn_ctrl_close(), which can be fused with conn_xprt_close() into conn_full_close(). The code is cleaner because the FD is now used only for some protocol-specific setup (that will eventually have to move) and to try to send a hard-coded HTTP 500 error message on raw sockets.	2020-10-15 21:47:56 +02:00
Willy Tarreau	83efc320aa	MEDIUM: listener: allocate the connection before queuing a new connection Till now we would keep a per-thread queue of pending incoming connections for which we would store: - the listener - the accepted FD - the source address - the source address' length And these elements were first used in session_accept_fd() running on the target thread to allocate a connection and duplicate them again. Doing this induces various problems. The first one is that session_accept_fd() may only run on file descriptors and cannot be reused for QUIC. The second issue is that it induces lots of memory copies and that the listener queue thrashes a lot of cache, consuming 64 bytes per entry. This patch changes this by allocating the connection before queueing it, and by only placing the connection's pointer into the queue. Indeed, the first two calls used to initialize the connection already store all the information above, which can be retrieved from the connection pointer alone. So we just have to pop one pointer from the target thread, and pass it to session_accept_fd() which only needs the FD for the final settings. This starts to make the accept path a bit more transport-agnostic, and saves memory and CPU cycles at the same time (1% connection rate increase was noticed with 4 threads). Thanks to dividing the accept-queue entry size from 64 to 8 bytes, its size could be increased from 256 to 1024 connections while still dividing the overall size by two. No single queue full condition was met. One minor drawback is that a connection may be allocated from one thread's pool to be used in another one. But this already happens a lot with connection reuse so there is really nothing new here.	2020-10-15 21:47:56 +02:00
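The size claim in the commit above can be checked with a few lines of C. The helper `sketch_queue_bytes()` is hypothetical and only multiplies the entry count by the entry size quoted in the message; it ignores any extra bookkeeping fields the real queue may carry.

```c
#include <stddef.h>

/* Hypothetical helper: storage used by the entries of one accept queue,
 * given the number of entries and the size of each entry in bytes.
 */
size_t sketch_queue_bytes(size_t entries, size_t entry_size)
{
    return entries * entry_size;
}
```

With the figures from the message, 256 entries of 64 bytes take 16384 bytes, while 1024 entries of 8 bytes take 8192 bytes: four times as many slots in half the space.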
Willy Tarreau	9b7587a6af	MINOR: connection: make sockaddr_alloc() take the address to be copied Roughly half of the calls to sockaddr_alloc() are made to copy an already known address. Let's optionally pass it in argument so that the function can handle the copy at the same time, this slightly simplifies its usage.	2020-10-15 21:47:56 +02:00
Willy Tarreau	0138f51f93	CLEANUP: fd: finally get rid of fd_done_recv() fd_done_recv() used to be useful with the FD cache because it used to allow to keep a file descriptor active in the poller without being marked as ready in the cache, saving it from ringing immediately, without incurring any system call. It was a way to make it yield to wait for new events leaving a bit of time for others. The only user left was the connection accepter (listen_accept()). We used to suspect that with the FD cache removal it had become totally useless since changing its readiness or not wouldn't change its status regarding the poller itself, which would be the only one deciding to report it again. Careful tests showed that it indeed has exactly zero effect nowadays, the syscall numbers are exactly the same with and without, including when enabling edge-triggered polling. Given that there's no more API available to manipulate it and that it was directly called as an optimization from listener_accept(), it's about time to remove it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	e53e7ec9d9	CLEANUP: protocol: remove the ->drain() function No protocol defines it anymore. The last user used to be the monitor-net stuff that got partially broken already when the tcp_drain() function moved to conn_sock_drain() with commit `e215bba95` ("MINOR: connection: make conn_sock_drain() work for all socket families") in 1.9-dev2. A part of this will surely move back later when non-socket connections arrive with QUIC but better keep the API clean and implement what's needed in time instead.	2020-10-15 21:47:04 +02:00
Willy Tarreau	9e9919dd8b	MEDIUM: proxy: remove obsolete "monitor-net" As discussed here during 2.1-dev, "monitor-net" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200 if ..." and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
Willy Tarreau	77e0daef9f	MEDIUM: proxy: remove obsolete "mode health" As discussed here during 2.1-dev, "mode health" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, doesn't support source filtering, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200" and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
Amaury Denoyelle	46f041d7f8	MEDIUM: fcgi: remove conn from session on detach FCGI mux is marked with HOL blocking. On safe reuse mode, the connections using it are placed on the sessions instead of the available lists to avoid sharing them with several clients. On detach, if there are no more streams, remove the connection from the session before adding it to the idle list. If there are still used streams, do not add it to the available list as it should already be on the session list.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	6b8daef56b	MEDIUM: h2: remove conn from session on detach H2 mux is marked with HOL blocking. On safe reuse mode, the connections using it are placed on the sessions instead of the available lists to avoid sharing them with several clients. On detach, if there are no more streams, remove the connection from the session before adding it to the idle list. If there are still used streams, do not add it to the available list as it should already be on the session list.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	0d21deaded	MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking If a connection is using a mux protocol subject to HOL blocking, add it to the session instead of the available list to avoid sharing it with other clients on connection reuse.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	00464ab8f4	MEDIUM: backend: add new conn to session if mux marked as HOL blocking When allocating a new session on connect_server, if the mux protocol is marked as subject of HOL blocking, add it into session instead of available list to avoid sharing it with other clients.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	3d3c0918dc	MINOR: mux/connection: add a new mux flag for HOL risk This flag is used to indicate if the mux protocol is subject to head-of-line blocking problem.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	9c13b62b47	BUG/MEDIUM: connection: fix srv idle count on conn takeover On server connection migration from one thread to another, the wrong idle thread-specific counter is decremented. This bug was introduced by commit `3d52f0f1f8` due to the factorization with srv_use_idle_conn. However, this statement is only executed from conn_backend_get. Extract the decrement from srv_use_idle_conn in conn_backend_get and use the correct thread-specific counter. Rename the function to srv_use_conn to better reflect its purpose as it is also used with a newly initialized connection not in the idle list. As a side change, the connection insertion to available list has also been extracted to conn_backend_get. This will be useful to be able to specify an alternative list for protocols subject to HOL risk that should not be shared between several clients. This bug is only present in this release and thus does not need a backport.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	5f1ded5629	BUG/MINOR: connection: fix loop iter on connection takeover The loop always missed one iteration due to the incrementation done on the for check. Move the incrementation on the loop last statement to fix this behaviour. This bug has a very limited impact, not at all visible to the user, but could be backported to 2.2.	2020-10-15 15:19:25 +02:00
Willy Tarreau	1a3770cbc7	BUG/MEDIUM: deinit: check fdtab before fdtab[fd].owner When running a pure config check (haproxy -c) we go through the deinit phase without having allocated fdtab, so we can't blindly dereference it. The issue was added by recent commit `ae7bc4a23` ("MEDIUM: deinit: close all receivers/listeners before scanning proxies"), no backport is needed.	2020-10-14 12:13:51 +02:00
Willy Tarreau	2f6f362756	CLEANUP: protocol: initialize all of the sockaddr when disconnecting In issue #894, Coverity suspects uninitialized values for a socket's address whose family is AF_UNSPEC but it doesn't know that the address is not used in this case. It's not on a critical path and working around it is trivial, let's fully declare the address. We're doing it for both TCP and UDP, because the same principle appears at two places.	2020-10-14 10:54:15 +02:00
Willy Tarreau	258b351704	BUG/MINOR: listener: detect and handle shared sockets stopped in other processes It may happen that during a temporary listener pause resulting from a SIGTTOU, one process gets one of its sockets disabled by another process and will not be able to recover from this situation by itself. For the protocols supporting this (TCPv4 and TCPv6 at the moment) this situation is detectable, so when this happens, let's put the listener into the PAUSED state so that it remains consistent with the real socket state. One nice effect is that just sending the SIGTTIN signal to the process is enough to recover the socket in this case. There is no need to backport this, this behavior has been there forever and the fix requires to reimplement the getsockopt() call there.	2020-10-13 18:15:33 +02:00
Willy Tarreau	85d2ba6b78	CLEANUP: unix: make use of sock_accept_conn() where relevant This allows to get rid of one getsockopt(SO_ACCEPTCONN) in the binding code.	2020-10-13 18:15:33 +02:00
Willy Tarreau	3e12de2cc6	CLEANUP: tcp: make use of sock_accept_conn() where relevant This allows to get rid of two getsockopt(SO_ACCEPTCONN).	2020-10-13 18:15:33 +02:00
Willy Tarreau	cc8b653483	MINOR: sockpair: implement the .rx_listening function For socket pairs we don't rely on a real listening socket but we need to have a properly connected UNIX stream socket. This is what the new sockpair_accept_conn() tries to report. Some corner cases like half shutdown will still not be detected but that should be sufficient for most cases we really care about.	2020-10-13 18:15:33 +02:00
Willy Tarreau	29185140db	MINOR: protocol: make proto_tcp & proto_uxst report listening sockets Now we introduce a new .rx_listening() function to report if a receiver is actually a listening socket. The reason for this is to help detect shared sockets that might have been broken by sibling processes.	2020-10-13 18:15:33 +02:00
Willy Tarreau	5ced3e8879	MINOR: sock: add sock_accept_conn() to test a listening socket At several places we need to check if a socket is still valid and still willing to accept connections. Instead of open-coding this, each time, let's add a new function for this.	2020-10-13 18:15:33 +02:00
Willy Tarreau	8b6fc3d10e	MINOR: proto-tcp: make use of connect(AF_UNSPEC) for the pause Currently the suspend/resume mechanism for listeners only works on Linux and we resort to a number of tricks involving shutdown+listen+shutdown to try to detect failures on other operating systems that do not support it. But on Linux connect(AF_UNSPEC) also works pretty well and is much cleaner. It still doesn't work on other operating systems but the error is easier to detect and appears safer. So let's switch to this.	2020-10-13 18:15:33 +02:00
Willy Tarreau	7c9f756dcc	MINOR: fd: report an error message when failing initial allocations When starting with a huge maxconn (say 1 billion), the only error seen is "No polling mechanism available". This doesn't help at all to resolve the problem. Let's add specific alerts for the failed mallocs. Now we can get this instead: [ALERT] 286/154439 (23408) : Not enough memory to allocate 2000000033 entries for fdtab! This may be backported as far as 2.0 as it helps debugging bad configurations.	2020-10-13 18:15:33 +02:00
Willy Tarreau	b1e600c9c5	BUG/MINOR: mux-h2: do not stop outgoing connections on stopping There are reports of a few "SC" in logs during reloads when H2 is used on the backend side. Christopher analysed this as being caused by the proxy disabled test in h2_process(). As the comment says, this was done for frontends only, and must absolutely not send a GOAWAY to the backend, as all it will result in is to make newly queued streams fail. The fix consists in simply testing the connection side before deciding to send the GOAWAY. This may be backported as far as 2.0, though for whatever reason it seems to manifest itself only since 2.2 (probably due to changes in the outgoing connection setup sequence).	2020-10-13 18:15:33 +02:00
Willy Tarreau	2bd0f8147b	BUG/MINOR: init: only keep rlim_fd_cur if max is unlimited On some operating systems, RLIM_INFINITY is set to -1 so that when the hard limit on the number of FDs is set to unlimited, taking the MAX of both values keeps rlim_fd_cur and everything works. But on other systems this value is defined as the highest positive integer. This is what was observed on a 32-bit AIX 5.1. The effect is that maxsock becomes 2^31-1 and that fdtab allocation fails. Note that a simple workaround consists in manually setting maxconn in the global section. Let's ignore unlimited as soon as we retrieve rlim_fd_max so that all systems behave consistently. This may be backported as far as 2.0, though it doesn't seem like it has annoyed anyone.	2020-10-13 15:36:08 +02:00
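
The rule from 2bd0f8147b can be sketched as a small pure function. This is only an illustration under assumptions: pick_fd_limit_sketch() is a hypothetical name, and the real code works on values returned by getrlimit() at boot rather than on plain arguments.

```c
#include <sys/resource.h>

/* Hypothetical helper: choose the FD limit to size tables from.
 * When the hard limit is "unlimited", RLIM_INFINITY may be a huge
 * positive value on some systems, so it is ignored and the soft
 * limit is kept. Otherwise the larger of both values is used.
 */
rlim_t pick_fd_limit_sketch(rlim_t cur, rlim_t max)
{
	if (max == RLIM_INFINITY)
		return cur;
	return (cur > max) ? cur : max;
}
```

With a soft limit of 1024 and an unlimited hard limit this returns 1024 on every system, regardless of how RLIM_INFINITY is encoded. The sketch does not cover the case where the soft limit itself is unlimited.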
Frédéric Lécaille	3fc0fe05fd	MINOR: peers: heartbeat, collisions and handshake information for "show peers" command. This patch adds the new "coll" counter and the heartbeat timer values to the "show peers" command. It also adds the elapsed time since the last handshake as a new "last_hdshk" peer dump field.	2020-10-09 20:59:58 +02:00
Willy Tarreau	0a002df2c2	BUG/MINOR: proxy: respect the proper format string in sig_pause/sig_listen When factoring out the pause/resume error messages in commit `775e00158` ("MAJOR: signals: use protocol_pause_all() and protocol_resume_all()") I forgot that ha_warning() and send_log() take a format string and not just a const string. No backport is needed, this is 2.3-dev.	2020-10-09 19:26:27 +02:00
Willy Tarreau	ccf429960b	MEDIUM: config: remove the deprecated and dangerous global "debug" directive This one was scheduled for removal in 2.3 since 2.2-dev3 by commit `1b85785bc` ("MINOR: config: mark global.debug as deprecated"). Let's remove it now. It remains totally possible to use -d on the command line though.	2020-10-09 19:18:45 +02:00
Willy Tarreau	ab0a5192a8	MEDIUM: config: mark "grace" as deprecated This was introduced 15 years ago or so to delay the stopping of some services so that a monitoring device could detect its port being down before services were stopped. Since then, clean reloads were implemented and this doesn't cope well with reload at all, preventing the new process from seamlessly binding, and forcing processes to coexist with half-baked configurations. Now it has become a real problem because there's a significant code portion in the proxies that is solely dedicated to this obsolete feature, and dealing with its special cases eases the introduction of bugs in other places so it's about time that it goes. We could tentatively schedule its removal for 2.4 with a hard deadline for 2.5 in any case.	2020-10-09 19:07:01 +02:00
Willy Tarreau	e03204c8e1	MEDIUM: listeners: implement protocol level ->suspend/resume() calls Now we have ->suspend() and ->resume() for listeners at the protocol level. This means that it now becomes possible for a protocol to redefine its own way to suspend and resume. The default functions are provided for TCP, UDP and unix, and they are pass-through to the receiver equivalent as it used to be till now. Nothing was defined for sockpair since it does not need to suspend/resume during reloads, hence it will succeed.	2020-10-09 18:44:37 +02:00
Willy Tarreau	7b2febde1d	MINOR: listeners: split do_unbind_listener() in two The inner part now goes into the protocol and is used to decide how to unbind a given protocol's listener. The existing code which is able to also unbind the receiver was provided as a default function that we currently use everywhere. Some complex listeners like QUIC will use this to decide how to unbind without impacting existing connections, possibly by setting up other incoming paths for the traffic.	2020-10-09 18:44:37 +02:00
Willy Tarreau	f58b8db47b	MEDIUM: receivers: add an rx_unbind() method in the protocols This is used as a generic way to unbind a receiver at the end of do_unbind_listener(). This allows to considerably simplify that function since we can now let the protocol perform the cleanup. The generic code was moved to sock.c, along with the conditional rx_disable() call. Now the code also supports that the ->disable() function of the protocol which acts on the listener performs the close itself and adjusts the RX_F_BOUND flag accordingly.	2020-10-09 18:44:36 +02:00
Willy Tarreau	18c20d28d7	MINOR: listeners: move the LI_O_MWORKER flag to the receiver This listener flag indicates whether the receiver part of the listener is specific to the master or to the workers. In practice it's only used by the master's CLI right now. It's used to know whether or not the FD must be closed before forking the workers. For this reason it's way more of a receiver's property than a listener's property, so let's move it there under the name RX_F_MWORKER. The rest of the code remains unchanged.	2020-10-09 18:43:05 +02:00
Willy Tarreau	75c98d166e	CLEANUP: listeners: remove the do_close argument to unbind_listener() And also remove it from its callers. This subtle distinction was added as sort of a hack for the seamless reload feature but is not needed anymore since do_close became unused with the previous commit ("MEDIUM: listener: let do_unbind_listener() decide whether to close or not"). This also removes the unbind_listener_no_close() function.	2020-10-09 18:41:56 +02:00
Willy Tarreau	374e9af358	MEDIUM: listener: let do_unbind_listener() decide whether to close or not The listener contains all the information needed to decide to close on unbind or not. The rule is the following (when we're not stopping): - worker process unbinding from a worker's FD with socket transfer enabled => keep - master process unbinding from a master's inherited FD => keep - master process unbinding from a master's FD => close - master process unbinding from a worker's FD => close - worker process unbinding from a master's FD => close - worker process unbinding from a worker's FD => close Let's translate that into the function and stop using the do_close argument that is a bit obscure for callers. It was not yet removed to ease code testing.	2020-10-09 18:41:48 +02:00
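
The six cases listed in 374e9af358 reduce to two "keep" conditions, with everything else closing. The sketch below encodes that table; the structure and field names are hypothetical simplifications, not HAProxy's listener or receiver types, and like the rule itself it only covers the case where the process is not stopping.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of one unbind situation. */
struct unbind_case_sketch {
	bool proc_is_master;  /* the process doing the unbind is the master */
	bool fd_for_master;   /* the FD belongs to the master (else a worker) */
	bool fd_inherited;    /* the FD was inherited */
	bool sock_transfer;   /* socket transfer is enabled */
};

/* Returns true if the FD should be closed on unbind, false to keep it. */
bool close_on_unbind_sketch(const struct unbind_case_sketch *c)
{
	/* worker unbinding from a worker's FD with socket transfer => keep */
	if (!c->proc_is_master && !c->fd_for_master && c->sock_transfer)
		return false;

	/* master unbinding from a master's inherited FD => keep */
	if (c->proc_is_master && c->fd_for_master && c->fd_inherited)
		return false;

	/* all four remaining combinations => close */
	return true;
}
```

Writing the rule this way makes it visible that the caller no longer needs to pass a do_close hint: the answer follows from properties already attached to the listener.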
Willy Tarreau	87acd4e848	BROKEN/MEDIUM: listeners: rework the unbind logic to make it idempotent BROKEN: the failure rate on reg-tests/seamless-reload/abns_socket.vtc has significantly increased for no obvious reason. It fails 99% of the time vs 10% before. do_unbind_listener() is not logical and is not even idempotent. It must not touch the fd if already -1, which also means not touch the receiver. In addition, when performing a partial stop on a socket (not closing), we know the socket remains in the listening state yet it's marked as LI_ASSIGNED, which is confusing as it doesn't translate its real state. With this change, we make sure that FDs marked for close end up in ASSIGNED state and that those which are really bound and on which a listen() was made (i.e. not pause) remain in LISTEN state. This is what is closest to reality. Ideally this function should become a default proto->unbind() one but it may still keep a bit too much state logic to become generalized to other protocols (e.g. QUIC).	2020-10-09 18:29:04 +02:00
Willy Tarreau	d6afb53bdc	MEDIUM: listeners: always close master vs worker listeners Right now in enable_listener(), we used to start all enabled listeners then kill from the workers those that were for the master. But this is incomplete. We must also close from the master the listeners that are solely for workers, and do it before we even start them. Otherwise we end up with a master responding to the worker CLI connections if the listener remains in listen mode to translate the socket's real state. It doesn't seem like it could have caused bugs in the past because we used to aggressively mark disabled listeners as LI_ASSIGNED despite the fact that they were still bound and listening. If this patch were ever seen as a candidate solution for any obscure bug, be careful in that it subtly relies on the fact that fd_delete() doesn't close inherited FDs anymore, otherwise that could break the master's ability to pass inherited FDs on reloads.	2020-10-09 18:29:04 +02:00
Willy Tarreau	95a3460739	MINOR: listener: add a few BUG_ON() statements to detect inconsistencies We must not have an fd==-1 when switching to certain states. This will later disappear but for now it helps detecting inconsistencies.	2020-10-09 18:29:04 +02:00
Willy Tarreau	e122dc5316	MEDIUM: udp: implement udp_suspend() and udp_resume() In Linux kernel's net/ipv4/udp.c there's a udp_disconnect() function which is called when connecting to AF_UNSPEC, and which unhashes a "connection". This property, which is also documented in connect(2) both in Linux and Open Group's man pages for datagrams, is interesting because it allows to reverse a connect() which is in fact a filter on the source. As such we can suspend a receiver by making it connect to itself, which will cause it not to receive any traffic anymore, letting a new one receive it all, then resume it by breaking this connection. This was tested to work well on Linux, other operating systems should also be tested. Before this, sending a SIGTTOU to a process having a UDP syslog forwarder would cause this error: [WARNING] 280/194249 (3268) : Paused frontend GLOBAL. [WARNING] 280/194249 (3268) : Some proxies refused to pause, performing soft stop now. [WARNING] 280/194249 (3268) : Proxy GLOBAL stopped (cumulated conns: FE: 0, BE: 0). [WARNING] 280/194249 (3268) : Proxy sylog-loadb stopped (cumulated conns: FE: 0, BE: 0). With this change, it now proceeds just like with TCP listeners: [WARNING] 280/195503 (3885) : Paused frontend GLOBAL. [WARNING] 280/195503 (3885) : Paused frontend sylog-loadb. And SIGTTIN also works: [WARNING] 280/195507 (3885) : Resumed frontend GLOBAL. [WARNING] 280/195507 (3885) : Resumed frontend sylog-loadb. On Linux this also works with TCP listeners (which can then be resumed using listen()) and established TCP sockets (which we currently kill using setsockopt(so_linger)), both not being portable on other OSes. UNIX sockets and ABNS sockets do not support it however (connect always fails). This needs to be further explored to see if other OSes might benefit from this to perform portable and reliable resets particularly on the backend side.	2020-10-09 18:29:04 +02:00
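
The mechanism described in e122dc5316 can be sketched with plain socket calls: connecting a bound UDP socket to its own address filters out other sources, and connecting it to an AF_UNSPEC address dissolves that association. The function names below are hypothetical and this is not the code of udp_suspend()/udp_resume(); as the commit notes, the behaviour was only reported as tested on Linux.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Helper for experiments: UDP socket bound to an ephemeral port on
 * 127.0.0.1. Returns the FD, or -1 on error.
 */
int udp_bind_loopback_sketch(void)
{
	struct sockaddr_in sin;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0; /* let the kernel pick a port */
	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* "Suspend": connect the socket to its own bound address. */
int udp_suspend_sketch(int fd)
{
	struct sockaddr_storage ss;
	socklen_t len = sizeof(ss);

	if (getsockname(fd, (struct sockaddr *)&ss, &len) < 0)
		return -1;
	return connect(fd, (struct sockaddr *)&ss, len);
}

/* "Resume": break the association by connecting to AF_UNSPEC. */
int udp_resume_sketch(int fd)
{
	struct sockaddr sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_family = AF_UNSPEC;
	return connect(fd, &sa, sizeof(sa));
}
```

Both functions return 0 on success and -1 on failure, so a caller can report a refused pause instead of silently falling back to a soft stop.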
Willy Tarreau	626f3a7beb	MEDIUM: proxy: make soft_stop() stop most listeners using protocol_stop_now() One difficulty in soft-stopping is to make sure not to forget unlisted listeners. By first doing a pass using protocol_stop_now() we catch the vast majority of them. The few remaining ones are the ones belonging to a proxy having a grace period. For these ones, the proxy will arm its stop_time timer and emit a log message. Since neither UDP listeners nor peers use the grace period, we can already get rid of the special cases there since we know they will have been stopped by the protocols.	2020-10-09 18:29:04 +02:00
Willy Tarreau	02e8557e88	MINOR: protocol: add protocol_stop_now() to instant-stop listeners This will instantly stop all listeners except those which belong to a proxy configured with a grace time. This means that UDP listeners, and peers will also be stopped when called this way.	2020-10-09 18:29:04 +02:00
Willy Tarreau	acde152175	MEDIUM: proxy: centralize proxy status update and reporting There are multiple ways a proxy may switch to the disabled state, but now it's essentially once it loses its last listener. Instead of keeping duplicate code around and reporting the state change before actually seeing it, we now report it at the moment it's performed (from the last listener leaving) which allows to remove the message from all other places.	2020-10-09 18:29:04 +02:00
Willy Tarreau	a389c9e1e3	MEDIUM: proxy: add mode PR_MODE_PEERS to flag peers frontends For now we cannot easily distinguish a peers frontend from another one, which will be problematic to avoid reporting them when stopping their listeners. Let's add PR_MODE_PEERS for this. It's not supposed to cause any issue since all non-HTTP proxies are handled similarly now.	2020-10-09 18:28:21 +02:00
Willy Tarreau	322b9b94e9	MEDIUM: proxy: make stop_proxy() now use stop_listener() The function will stop the listeners using this method, which in turn will ping back once it finishes disabling the proxy.	2020-10-09 18:28:18 +02:00
Willy Tarreau	caa7df1296	MINOR: listeners: add a new stop_listener() function This function will be used to definitely stop a listener (e.g. during a soft_stop). This is actually tricky because it may be called for a proxy or for a protocol, both of which require locks and already hold some. The function takes booleans indicating which ones are already held, hoping this will be enough. It's not well defined whether proto->disable() and proto->rx_disable() are supposed to be called with any lock held, and they are used from do_unbind_listener() with all these locks. Some back annotations ought to be added on this point. The proxy's listeners count is updated, and the proxy is marked as disabled and woken up after the last one is gone. Note that a listener in listen state is already not attached anymore since it was disabled.	2020-10-09 18:27:48 +02:00
Willy Tarreau	455585e3cd	MINOR: listeners: count unstoppable jobs on creation, not deletion We have to count unstoppable jobs which correspond to worker sockpairs, in order to know when to count. However the way it's currently done is quite awkward because these are counted when stopping, making the stop mechanism non-idempotent. This is definitely something we want to fix before stopping by protocol or our listeners count will quickly go wrong. Now they are counted when the listeners are created.	2020-10-09 18:25:14 +02:00
Willy Tarreau	b4c083f5bf	MINOR: listeners: split delete_listener() in two versions We'll need an already locked variant of this function so let's make __delete_listener() which will be called with the protocol lock held and the listener's lock held.	2020-10-09 11:27:30 +02:00
Willy Tarreau	4b51f42899	MEDIUM: listeners: now use the listener's ->enable/disable At each place we used to manipulate the FDs directly we can now call the listener protocol's enable/disable/rx_enable/rx_disable depending on whether the state changes on the listener or the receiver. One exception currently remains in listener_accept() which is a bit special and which should be split into 2 or 3 parts in the various protocol layers. The test of fd_updt in do_unbind_listener() that was added by commit `a51885621` ("BUG/MEDIUM: listeners: Don't call fd_stop_recv() if fd_updt is NULL.") could finally be removed since that part is correctly handled in the low-level disable() function. One disable() was added in resume_listener() before switching to LI_FULL because rx_resume() enables polling on the FD for the receiver while we want to disable it if the listener is full. There are different ways to clean this up in the future. One of them could be to consider that TCP receivers only act at the listener level. But in fact it does not translate reality. The reality is that only the receiver is paused and that the listener's state ought not be affected here. Ultimately the resume_listener() function should be split so that the part controlled by the protocols only acts on the receiver, and that the receiver itself notifies the upper listener about the change so that the listener protocol may decide to disable or enable polling. Conversely the listener should automatically update its receiver when they share the same state. Since there is no harm proceeding like this, let's keep this for now.	2020-10-09 11:27:30 +02:00
Willy Tarreau	5ddf1ce9c4	MINOR: protocol: add a new pair of enable/disable methods for listeners These methods will be used to enable/disable accepting new connections so that listeners do not play with FD directly anymore. Since all the currently supported protocols work on socket for now, these are identical to the rx_enable/rx_disable functions. However they were not defined in sock.c since it's likely that some will quickly start to differ. At the moment they're not used. We have to take care of fd_updt before calling fd_{want,stop}_recv() because it's allocated fairly late in the boot process and some such functions may be called very early (e.g. to stop a disabled frontend's listeners).	2020-10-09 11:27:30 +02:00
Willy Tarreau	686fa3db50	MINOR: protocol: add a new pair of rx_enable/rx_disable methods These methods will be used to enable/disable rx at the receiver level so that callers don't play with FDs directly anymore. All our protocols use the generic ones from sock.c at the moment. For now they're not used.	2020-10-09 11:27:30 +02:00
Willy Tarreau	e70c7977f2	MINOR: sock: provide a set of generic enable/disable functions These will be used on receivers, to enable or disable receiving on a listener, which most of the time just consists in enabling/disabling the file descriptor. We have to take care of the existence of fd_updt to know if we may or not call fd_{want,stop}_recv() since it's not permitted in very early boot.	2020-10-09 11:27:30 +02:00
Willy Tarreau	010fe151ce	MINOR: listener: use the protocol's ->rx_resume() method when available Instead of calling listen() for IPPROTO_TCP in resume_listener(), let's call the protocol's ->rx_resume() method when defined, which does the same. This removes another hard-dependency on the fd and underlying protocol from the generic functions.	2020-10-09 11:27:30 +02:00
Willy Tarreau	58e6b71bb0	MINOR: protocol: implement an ->rx_resume() method This one undoes ->rx_suspend(), it tries to restore an operational socket. It was only implemented for TCP since it's the only one we support right now.	2020-10-09 11:27:30 +02:00
Willy Tarreau	cb66ea60cf	MINOR: protocol: replace ->pause(listener) with ->rx_suspend(receiver) The ->pause method is inappropriate since it doesn't exactly "pause" a listener but rather temporarily disables it so that it's not visible at all to let another process take its place. The term "suspend" is more suitable, since the "pause" is actually what we'll need to apply to the FULL and LIMITED states which really need to make a pause in the accept process. And it goes well with the use of the "resume" function that will also need to be made per-protocol. Let's rename the function and make it act on the receiver since it's already what it essentially does, hence the prefix "_rx" to make it more explicit. The protocol struct was a bit reordered because it was becoming a real mess between the parts related to the listeners and those for the receivers.	2020-10-09 11:27:30 +02:00
Willy Tarreau	d7f331c8b8	MINOR: protocol: rename the ->listeners field to ->receivers Since the listeners were split into receiver+listener, this field ought to have been renamed because it's confusing. It really links receivers and not listeners, as most of the time it's used via rx.proto_list! The nb_listeners field was updated accordingly.	2020-10-09 11:27:30 +02:00
Willy Tarreau	dae0692717	CLEANUP: listeners: remove the now unused enable_all_listeners() It's not used anymore since previous commit. The good thing is that no more listener function now directly acts on a protocol.	2020-10-09 11:27:30 +02:00
Willy Tarreau	078e1c7102	CLEANUP: protocol: remove the ->enable_all method It's not used anymore, now the listeners are enabled from protocol_enable_all().	2020-10-09 11:27:30 +02:00
Willy Tarreau	5b95ae6b32	MINOR: protocol: directly call enable_listener() from protocol_enable_all() protocol_enable_all() calls proto->enable_all() for all protocols, which is always equal to enable_all_listeners() which in turn simply is a generic loop calling enable_listener() always returning ERR_NONE. Let's clean this madness by first calling enable_listener() directly from protocol_enable_all().	2020-10-09 11:27:30 +02:00
Willy Tarreau	7834a3f70f	MINOR: listeners: export enable_listener() we'll soon call it from outside.	2020-10-09 11:27:30 +02:00
Willy Tarreau	d008009958	CLEANUP: listeners: remove unused disable_listener and disable_all_listeners These ones have never been called, they were referenced by the protocol's disable_all for some protocols but there are no traces of their use, so in addition to not being sure the code works, it has never been tested. Let's remove a bit of complexity starting from there.	2020-10-09 11:27:30 +02:00
Willy Tarreau	fb4ead8e8a	CLEANUP: protocol: remove the ->disable_all method This one has never been used, is only referenced by proto_uxst and proto_sockpair, and it's not even certain it works at all. Let's get rid of it.	2020-10-09 11:27:30 +02:00
Willy Tarreau	e53608b2cd	MINOR: listeners: move fd_stop_recv() to the receiver's socket code fd_stop_recv() has nothing to do in the generic listener code, it's per protocol as some don't need it. For instance with abns@ it could even lead to fd_stop_recv(-1). And later with QUIC we don't want to touch the fd at all! It used to be that since commit `f2cb169487` delegating fd manipulation to their respective threads it wasn't possible to call it down there but it's not the case anymore, so let's perform the action in the protocol-specific code.	2020-10-09 11:27:30 +02:00
Willy Tarreau	fb76bd5ca6	BUG/MEDIUM: listeners: correctly report pause() errors By using the same "ret" variable in the "if" block to test the return value of pause(), the second one shadows the first one and when forcing the result to zero in case of an error, it doesn't do anything. The problem is that some listeners used to fail to pause in multi-process mode and this was not reported, but their failure was automatically resolved by the last process to pause. By properly checking for errors we might now possibly report a race once in a while so we may have to roll this back later if some users meet it. The test on ==0 is wrong too since technically speaking a total stop validates the need for a pause, but stops the listener so it's just the resume that won't work anymore. We could switch to stopped but it's an involuntary switch and the user will not know. Better then mark it as paused and let the resume continue to fail so that only the resume will eventually report an error (e.g. abns@). This must not be backported as there is a risk of side effect by fixing this bug, given that it hides other bugs itself.	2020-10-09 11:27:30 +02:00
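
The shadowing problem described in fb76bd5ca6 is easy to reproduce in isolation. The sketch below is a reduced illustration with hypothetical function names, not the original pause_listener() code: the first variant declares a second "ret" inside the "if" block, so forcing it to zero never reaches the value that is returned.

```c
/* Stand-in for a protocol pause callback that always fails. */
int failing_pause_sketch(void)
{
	return -1;
}

/* Buggy variant: the inner "ret" shadows the outer one. */
int pause_shadowed_sketch(int (*pause_fn)(void))
{
	int ret = 1;

	if (pause_fn) {
		int ret = pause_fn(); /* new variable, hides the outer ret */

		if (ret < 0)
			ret = 0;      /* only changes the inner variable */
	}
	return ret;                   /* outer ret, still 1: error is lost */
}

/* Fixed variant: a single "ret" carries the result out of the block. */
int pause_fixed_sketch(int (*pause_fn)(void))
{
	int ret = 1;

	if (pause_fn) {
		ret = pause_fn();
		if (ret < 0)
			ret = 0;
	}
	return ret;
}
```

Compilers accept the buggy variant silently by default; GCC and Clang only flag it when -Wshadow is enabled.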
Willy Tarreau	91c614dd0e	MEDIUM: proto_tcp: make the pause() more robust in multi-process In multi-process, the TCP pause is very brittle and we never noticed it because the error was lost in the upper layers. The problem is that shutdown() may fail if another process already did it, and will cause a process to fail to pause. What we do here in case of error is that we double-check the socket's state to verify if it's still accepting connections, and if not, we can conclude that another process already did the job in parallel. The difficulty here is that we're trying to eliminate false positives where some OSes will silently report a success on shutdown() while they don't shut the socket down, hence this dance of shutw/listen/shutr that only keeps the compatible ones. A new approach relying on connect(AF_UNSPEC) would probably provide better results.	2020-10-09 11:27:30 +02:00
Willy Tarreau	1accacbcc3	CLEANUP: proxy: remove the now unused pause_proxies() and resume_proxies() They're not used anymore, delete them before someone thinks about using them again!	2020-10-09 11:27:30 +02:00
Willy Tarreau	775e00158a	MAJOR: signals: use protocol_pause_all() and protocol_resume_all() When temporarily pausing the listeners with SIG_TTOU, we now pause all listeners via the protocols instead of the proxies. This has the benefits that listeners are paused regardless of whether or not they belong to a visible proxy. And for resuming via SIG_TTIN we do the same, which allows to report binding conflicts and address them, since the operation can be repeated on a per-listener basis instead of a per-proxy basis. While in appearance all cases were properly handled, it's impossible to completely rule out the possibility that something broken used to work by luck due to the scan ordering which is naturally different, hence the major tag.	2020-10-09 11:27:30 +02:00
Willy Tarreau	09819d1118	MINOR: protocol: introduce protocol_{pause,resume}_all() These two functions are used to pause and resume all listeners of all protocols. They use the standard listener functions for this so they're supposed to handle the situation gracefully regardless of the upper proxies' states, and they will report completion on proxies once the switch is performed. It might be nice to define a particular "failed" state for listeners that cannot resume and to count them on proxies in order to mention that they're definitely stuck. On the other hand, the current situation is retryable which is quite appreciable as well.	2020-10-09 11:27:30 +02:00
Willy Tarreau	58651b42fc	MEDIUM: listener/proxy: make the listeners notify about proxy pause/resume Till now, we used to call pause_proxy()/resume_proxy() to enable/disable processing on a proxy, which is used during soft reloads. But since we want to drive this process from the listeners themselves, we have to instead proceed the other way around so that when we enable/disable a listener, it checks if it changed anything for the proxy and notifies about updates at this level. The detection is made using li_ready=0 for pause(), and li_paused=0 for resume(). Note that we must not include any test for li_bound because this state is seen by processes which share the listener with another one and which must not act on it since the other process will do it. As such the socket behind the FD will automatically be paused and resume without its local state changing, but this is the limit of a multi-process system with shared listeners.	2020-10-09 11:27:30 +02:00
Willy Tarreau	5d7f9ce831	MINOR: listeners: check the current listener earlier state in resume_listener() It's quite confusing to have the test on LI_READY very low in the function as it should be made much earlier. Just like with previous commit, let's do it when entering. The additional states, however (limited, full) continue to go through the whole function.	2020-10-09 11:27:30 +02:00
Willy Tarreau	9b3a932777	MINOR: listeners: check the current listener state in pause_listener() It's better not to try to perform pause() actions on wrong states, so let's check this and make sure that all callers are now safe. This means that we must not try to pause a listener which is already paused (e.g. it could possibly fail if the pause operation isn't idempotent at the socket level), nor should we try it on earlier states.	2020-10-09 11:27:30 +02:00
Willy Tarreau	337c835d16	MEDIUM: proxy: merge zombify_proxy() with stop_proxy() The two functions don't need to be distinguished anymore since they have all the necessary info to act as needed on their listeners. Let's just pass via stop_proxy() and make it check for each listener which one to close or not.	2020-10-09 11:27:30 +02:00
Willy Tarreau	43ba3cf2b5	MEDIUM: proxy: remove start_proxies() Its sole remaining purpose was to display "proxy foo started", which has little benefit and pollutes output for those with plenty of proxies. Let's remove it now. The VTCs were updated to reflect this, because many of them had explicit counts of dropped lines to match this message. This is tagged as MEDIUM because some users may be surprised by the loss of this quite old message.	2020-10-09 11:27:30 +02:00
Willy Tarreau	c3914d4fff	MEDIUM: proxy: replace proxy->state with proxy->disabled The remaining proxy states were only used to distinguish an enabled proxy from a disabled one. Due to the initialization order, both PR_STNEW and PR_STREADY were equivalent after startup, and they would only differ from PR_STSTOPPED when the proxy is disabled or shutdown (which is effectively another way to disable it). Now we just have a "disabled" field which allows to distinguish them. It's becoming obvious that start_proxies() is only used to print a greeting message now, that we'd rather get rid of. Probably that zombify_proxy() and stop_proxy() should be merged once their differences move to the right place.	2020-10-09 11:27:30 +02:00
Willy Tarreau	1ad64acf6c	CLEANUP: peers: don't use the PR_ST* states to mark enabled/disabled The enabled/disabled config options were stored into a "state" field that is an integer but contained only PR_STNEW or PR_STSTOPPED, which is a bit confusing, and causes a dependency with proxies. This was renamed to "disabled" and is used as a boolean. The field was also moved to the end of the struct to stop creating a hole and fill another one.	2020-10-09 11:27:30 +02:00
Willy Tarreau	b50bf046e8	MINOR: startup: don't rely on PR_STNEW to check for listeners Instead of looking at listeners in proxies in PR_STNEW state, we'd rather check for listeners in those not in PR_STSTOPPED as it's only this state which indicates the proxy was disabled. And let's check the listeners count instead of testing the list's head.	2020-10-09 11:27:30 +02:00
Willy Tarreau	f18d968830	MEDIUM: proxy: remove state PR_STPAUSED This state was used to mention that a proxy was in PAUSED state, as opposed to the READY state. This was causing some trouble because if a listener failed to resume (e.g. because its port was temporarily in use during the resume), it was not possible to retry the operation later. Now by checking the number of READY or PAUSED listeners instead, we can accurately know if something went bad and try to fix it again later. The case of the temporary port conflict during resume now works well: $ socat readline /tmp/sock1 prompt > disable frontend testme3 > disable frontend testme3 All sockets are already disabled. > enable frontend testme3 Failed to resume frontend, check logs for precise cause (port conflict?). > enable frontend testme3 > enable frontend testme3 All sockets are already enabled.	2020-10-09 11:27:30 +02:00
Willy Tarreau	a17c91b37f	MEDIUM: proxy: remove the PR_STERROR state This state is only set when a pause() fails but isn't even set when a resume() fails. And we cannot recover from this state. Instead, let's just count remaining ready listeners to decide to emit an error or not. It's more accurate and will better support new attempts if needed.	2020-10-09 11:27:30 +02:00
Willy Tarreau	6b3bf733dd	MEDIUM: proxy: remove the unused PR_STFULL state Since v1.4 or so, it's almost not possible anymore to set this state. The only exception is by using the CLI to change a frontend's maxconn setting below its current usage. This case makes no sense, and for other cases it doesn't make sense either because "full" is a vague concept when only certain listeners are full and not all. Let's just remove this unused state and make it clear that it's not reported. The "ready" or "open" states will continue to be reported without being misleading as they will be opposed to "stop".	2020-10-09 11:27:30 +02:00
Willy Tarreau	efc0eec4c1	MINOR: proxy: maintain per-state counters of listeners The proxy state tries to be synthetic but that doesn't work well with many listeners, especially for transition phases or after a failed pause/resume. In order to address this, we'll instead rely on counters of listeners in a given state for the 3 major states (ready, paused, listen) and a total counter. We'll now be able to determine a proxy's state by comparing these counters only.	2020-10-09 11:27:30 +02:00
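
A minimal sketch of the per-state counters from efc0eec4c1, assuming every state change goes through a single setter. All type and function names here are hypothetical and the state list is simplified; only the idea of deriving the proxy's condition from counters (no ready listener left and at least one paused) follows the commit messages in this log.

```c
#include <stddef.h>

/* Simplified listener states, for illustration only. */
enum li_state_sketch { LS_INIT, LS_LISTEN, LS_PAUSED, LS_READY };

struct proxy_sketch {
	unsigned int li_all;    /* total number of attached listeners */
	unsigned int li_listen; /* listeners in each counted state */
	unsigned int li_paused;
	unsigned int li_ready;
};

struct listener_sketch {
	enum li_state_sketch state;
	struct proxy_sketch *px;
};

/* Returns the counter tracking <st>, or NULL if <st> is not counted. */
static unsigned int *state_counter_sketch(struct proxy_sketch *px,
                                          enum li_state_sketch st)
{
	switch (st) {
	case LS_LISTEN: return &px->li_listen;
	case LS_PAUSED: return &px->li_paused;
	case LS_READY:  return &px->li_ready;
	default:        return NULL;
	}
}

/* Attaches <l> to <px> in the initial, uncounted state. */
void listener_attach_sketch(struct listener_sketch *l, struct proxy_sketch *px)
{
	l->px = px;
	l->state = LS_INIT;
	px->li_all++;
}

/* Single entry point for state changes, keeping counters consistent. */
void listener_set_state_sketch(struct listener_sketch *l,
                               enum li_state_sketch st)
{
	unsigned int *cnt;

	if ((cnt = state_counter_sketch(l->px, l->state)) != NULL)
		(*cnt)--;
	if ((cnt = state_counter_sketch(l->px, st)) != NULL)
		(*cnt)++;
	l->state = st;
}

/* A proxy counts as paused once no listener is ready and one is paused. */
int proxy_is_paused_sketch(const struct proxy_sketch *px)
{
	return px->li_paused > 0 && px->li_ready == 0;
}
```

Because the answer is recomputed from the counters, a listener that fails to resume simply leaves li_paused non-zero, and the operation can be retried later without any sticky error state.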
Willy Tarreau	a37b244509	MINOR: listeners: introduce listener_set_state() This function is used as a wrapper to set a listener's state everywhere. We'll use it later to maintain some counters in a consistent state when switching state so it's capital that all state changes go through it. No functional change was made beyond calling the wrapper.	2020-10-09 11:27:30 +02:00
Willy Tarreau	bec7ab0ad9	CLEANUP: proxy: remove the first_to_listen hack in zombify_proxy() This thing was needed for an optimization used in soft_stop() which doesn't exist anymore, so let's remove it as it's cryptic and hinders the listeners cleanup.	2020-10-09 11:27:29 +02:00
Willy Tarreau	987dbf5bab	MINOR: listeners: do not uselessly try to close zombie listeners in soft_stop() The loop doesn't match anymore since the non-started listeners are in LI_INIT and even if it had ever worked the benefit of closing zombies at this point looks void at best.	2020-10-09 11:27:29 +02:00
Willy Tarreau	c6dac6c7f5	MEDIUM: listeners: remove the now unused ZOMBIE state The zombie state is not used anymore by the listeners, because in the last two cases where it was tested it couldn't match as it was covered by the test on the process mask. Instead now the FD is either in the LISTEN state or the INIT state. This also avoids forcing the listener to be single-dimensional because actually belonging to another process isn't totally exclusive with the other states, which explains some of the difficulties requiring to check the proc_mask and the fd sometimes. So let's get rid of it now not to be tempted to reuse it. The doc on the listeners state was updated.	2020-10-09 11:27:29 +02:00
Willy Tarreau	ae7bc4a237	MEDIUM: deinit: close all receivers/listeners before scanning proxies Because of the zombie state, proxies have a skewed vision of the state of listeners, which explains why there are hacks switching the state from ZOMBIE to INIT in the proxy cleaning loop. This is particularly complicated and not needed, as all the information is now available in the protocol list and the fdtab. What we do here instead is to first close all active listeners or receivers by protocol and clean their protocol parts. Then we scan the fdtab to get rid of remaining ones that were necessarily in INIT state after a previous invocation of delete_listener(). From this point, we know the listeners are cleaned, and they can safely be freed by scanning the proxies.	2020-10-09 11:27:29 +02:00
Willy Tarreau	b6607bfaf0	MEDIUM: listeners: make unbind_listener() converge if needed The ZOMBIE state on listener is a real mess. Listeners passing through this state have lost their consistency with the proxy AND with the fdtab. Plus this state is not used for all foreign listeners, only for those belonging to a proxy that entirely runs on another process, otherwise it stays in INIT state, which makes the usefulness extremely questionable. But the real issue is that it's impossible to untangle the receivers from the proxy state as long as we have this because of deinit()... So what we do here is to start by making unbind_listener() support being called more than once. This will permit to call it again to really close the FD and finish the operations if it's called with an FD that's in a fake state (such as INIT but with a valid fd).	2020-10-09 11:27:29 +02:00
Willy Tarreau	02b092f006	MEDIUM: init: stop disabled proxies after initializing fdtab During the startup process we don't have any fdtab nor fd_updt for quite a long time, and as such some operations on the listeners are not permitted, such as fd_want_/fd_stop_ or fd_delete(). The latter is of particular concern because it's used when stopping a disabled frontend, and it's performed very early during check_config_validity() while there is no fdtab yet. The trick till now relies on the listener's state which is a bit brittle. There is absolutely no valid reason for stopping a proxy's listeners this early, we can postpone it after init_pollers() which will at least have allocated fdtab.	2020-10-09 11:27:29 +02:00
Willy Tarreau	cb89e32f31	MEDIUM: listeners: don't bounce listeners management between queues During 2.1 development, commit `f2cb16948` ("BUG/MAJOR: listener: fix thread safety in resume_listener()") was introduced to bounce the enabling/disabling of a listener's FD to one of its threads because the remains of fd_update_cache() were fundamentally incompatible with the need to call fd_want_recv() or fd_stop_recv() for another thread. However since then we've totally dropped such code and it's totally safe to use these functions on an FD that is solely used by another thread (this is even used by the FD migration code). The only remaining limitation concerning the wake up delay was addressed by previous commit "MEDIUM: fd: always wake up one thread when enabling a foreing FD". The current situation forces the FD management to remain in the pause_listener() and resume_listener() functions just so that it can bounce between threads, without having the ability to delegate it to the suitable protocol layer. So let's first remove this now unneeded workaround.	2020-10-09 11:27:29 +02:00
Willy Tarreau	f015887444	MEDIUM: fd: always wake up one thread when enabling a foreing FD Since 2.2 it's safe to enable/disable another thread's FD but the fd_wake calls will not immediately be considered because nothing wakes the other threads up. This will have an impact on listeners when deciding to resume them after they were paused, so at minima we want to wake up one of their threads, just like the scheduler does on task_kill(). This is what this patch does.	2020-10-09 11:27:29 +02:00
Christopher Faulet	b8d148a93f	BUG/MINOR: http-htx: Expect no body for 204/304 internal HTTP responses 204 and 304 HTTP responses must not contain a message body. These status codes are correctly handled when the responses are received from a server. But there is no specific processing for internal HTTP responses (errorfiles and http replies). Now, when errorfiles or http replies are parsed during the configuration parsing, an error is triggered if a 204/304 message contains a body. An extra check is also performed to ensure the body length matches the announced content-length. This patch should fix the issue #891. It must be backported as far as 2.0. For 2.1 and 2.0, only the http_str_to_htx() function must be fixed. The http_parse_http_reply() function does not exist there.	2020-10-09 10:02:09 +02:00
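The two checks described in `b8d148a93f` can be sketched as below. This is an illustration of the rule, not the logic of http_str_to_htx(): the function name, its arguments, the error strings, and the order of the checks are all assumptions.

```python
def check_internal_reply(status, headers, body):
    """Return an error string for an invalid internal reply, else None.

    status  -- integer HTTP status code
    headers -- dict of lower-cased header names to values
    body    -- bytes of the message body (may be empty)
    """
    # 204 and 304 responses must not carry a message body.
    if status in (204, 304):
        if body:
            return "status %d must not have a body" % status
        return None

    # For other statuses, an announced content-length must match the body.
    announced = headers.get("content-length")
    if announced is not None and int(announced) != len(body):
        return "content-length is %s but body is %d bytes" % (announced, len(body))
    return None
```

The second check is the kind that would catch a mismatch such as 96 bytes announced for a 97-byte body.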
Christopher Faulet	5563392554	BUG/MINOR: http: Fix content-length of the default 500 error 96 bytes is announced in the C-L header for a message body of 97 bytes. This bug was introduced by the patch `46a030cdd` ("CLEANUP: assorted typo fixes in the code and comments"). This patch must be backported to all versions where the patch above is present (only 2.2 for now).	2020-10-09 10:02:09 +02:00
Christopher Faulet	aade4edc1a	BUG/MEDIUM: mux-h2: Don't handle pending read0 too early on streams This patch is similar to the previous one on the fcgi. The same is true for H2, but the bug is far harder to trigger because of the way the protocol sequences events. It may nevertheless explain strange aborts in some edge cases. A read0 received on the connection must not be handled too early by H2 streams. If the demux buffer is not empty, the pending read0 must not be considered. The H2 streams must not be passed in half-closed remote state in h2s_wake_one_stream() and the CS_FL_EOS flag must not be set on the associated conn-stream in h2_rcv_buf(). To sum up, it means, if there are still data pending in the demux buffer, no abort must be reported to the streams. To fix the issue, a dedicated function has been added, responsible for detecting pending read0 for a H2 connection. A read0 is reported only if the demux buffer is empty. This function is used instead of conn_xprt_read0_pending() at some places. Note that the HREM stream state should not be used to report aborts. It is performed on h2s_wake_one_stream() function and it is a legacy of the very first versions of the mux-h2. This patch should be backported as far as 2.0. In the 1.8, the code is too different to apply it like that. But it is probably useless because the mux-h2 can only be installed on the client side.	2020-10-09 10:02:09 +02:00
Christopher Faulet	6670e3e2bf	BUG/MEDIUM: mux-fcgi: Don't handle pending read0 too early on streams A read0 received on the connection must not be handled too early by FCGI streams. If the demux buffer is not empty, the pending read0 must not be considered. The FCGI streams must not be passed in half-closed remote state in fcgi_strm_wake_one_stream() and the CS_FL_EOS flag must not be set on the associated conn-stream in fcgi_rcv_buf(). To sum up, it means, if there are still data pending in the demux buffer, no abort must be reported to the streams. To fix the issue, a dedicated function has been added, responsible for detecting pending read0 for a FCGI connection. A read0 is reported only if the demux buffer is empty. This function is used instead of conn_xprt_read0_pending() at some places. This patch should fix the issue #886. It must be backported as far as 2.1.	2020-10-09 10:02:00 +02:00
Emeric Brun	b0c331f71f	BUG/MINOR: proxy/log: frontend/backend and log forward names must differ This patch disallows using the same name for a log forward section and a frontend/backend section.	2020-10-08 08:53:26 +02:00
Emeric Brun	cbb7bf7dd1	MEDIUM: log: syslog TCP support on log forward section. This patch re-introduces the "bind" statement on log forward sections to handle syslog TCP listeners as defined in RFC 6587. In addition, it introduces the "maxconn", "backlog" and "timeout client" statements to configure those listeners.	2020-10-07 17:17:27 +02:00
Emeric Brun	6d75616951	MINOR: channel: new getword and getchar functions on channel. This patch adds two new functions to get a char or a word from a channel.	2020-10-07 17:17:27 +02:00
Emeric Brun	2897644ae5	MINOR: stats: inc req counter on listeners. This patch enables count of requests for listeners if listener's counters are enabled.	2020-10-07 17:17:27 +02:00
Emeric Brun	c47ba59d1e	BUG/MEDIUM: log: old processes with log foward section don't die on soft stop. Old processes didn't die if a log forward section was declared and a soft stop was requested. This patch fixes this issue and should be backported to branches including the log forward feature.	2020-10-07 17:17:27 +02:00
Emeric Brun	a39ecbdac1	BUG/MINOR: proxy: inc req counter on new syslog messages. Increase req counter instead of conn counter on new syslog messages. This should be backported on branches including the syslog forward feature.	2020-10-07 17:17:27 +02:00
Christopher Faulet	9589aa0fe5	CLEANUP: sock-unix: Remove an unreachable goto clause Coverity reported dead code in sock_unix_bind_receiver() function. A goto clause is unreachable because of the preceding if/else block. This patch should fix the issue #865. No backport needed.	2020-10-07 14:37:03 +02:00
Christopher Faulet	7b06d3adaa	MINOR: mux-h1: Don't wakeup the H1C when output buffer become available There is no reason to wake up the H1 connection when a new output buffer is retrieved after an allocation failure because only the H1 stream will fill it.	2020-10-07 14:07:29 +02:00
Christopher Faulet	e9da975aab	BUG/MINOR: mux-h1: Always set the session on frontend h1 stream The session is always defined for a frontend connection. When a new client connection is established, the session is set for the first H1 stream. But on keep-alive connections, it is not set for the following H1 streams although it is possible. This patch is tagged as a bug because it fixes an inconsistency in the H1 streams creation. But it does not fix a known bug. This patch must be backported as far as 2.0.	2020-10-07 14:07:29 +02:00
Christopher Faulet	69f2cb8df3	BUG/MINOR: mux-h1: Be sure to only set CO_RFL_READ_ONCE for the first read The condition to set CO_RFL_READ_ONCE flag is not really accurate. We must check the request state on frontend connection only and, in the opposite, the response state on backend connection only. Only the parsed side must be considered, not the opposite one. This patch must be backported to 2.2.	2020-10-07 14:07:29 +02:00
Christopher Faulet	58feb49ed2	CLEANUP: ssl: Release cached SSL sessions on deinit On deinit, when the server SSL ctx is released, we must take care to release the cached SSL sessions stored in the array <ssl_ctx.reused_sess>. There are global.nbthread entries in this array, each one may have a pointer on a cached session. This patch should fix the issue #802. No backport needed.	2020-10-07 14:07:29 +02:00
Tim Duesterhus	d7c6e6a71d	CLEANUP: cache: Fix leak of cconf->c.name during config check During the config check, the post parsing is not performed. Thus, cache filters are not fully initialized and their cache name are never released. To be able to release them, a flag is now set when a cache filter is fully initialized. On deinit, if the flag is not set, it means the cache name must be freed. The patch should fix #849. No backport needed. [Cf: Tim is the patch author, but I added the commit message]	2020-10-07 14:07:29 +02:00
Christopher Faulet	a10000305f	BUG/MINOR: proto_tcp: Report warning messages when listeners are bound When a TCP listener is bound, in the tcp_bind_listener() function, a warning message may be reported and should be displayed on verbose mode. But the warning message is actually lost if the socket is successfully bound because we don't fill the <errmsg> variable in this case. This patch should fix the issue #863. No backport is needed.	2020-10-07 14:07:16 +02:00
Frédéric Lécaille	e7e2b21d27	BUG/MINOR: peers: Inconsistency when dumping peer status codes. A peer connection status must be considered as valid only if there is an applet which has been instantiated for the connection to the peer. So, ->statuscode should be considered as the last known peer connection status from the last connection to this peer if any. To reflect this, the "statuscode" field of the peer dump is renamed to "last_statuscode". This patch also adds an "active"/"inactive" field after the peer location type ("remote" or "local"), depending on whether an applet has been instantiated for this peer connection. Thank you to Emeric for having noticed this issue. Must be backported to versions >= 1.9.	2020-10-07 07:27:01 +02:00
Amaury Denoyelle	27373f7f75	MINOR: stats: remove for loop declaration Remove variable declaration inside a for-loop. This was introduced by my patch series implementing dynamic stats. This is not supported by older gcc, notably on the freebsd environment of the ci.	2020-10-05 17:55:40 +02:00
Amaury Denoyelle	fbd0bc98fe	MINOR: dns/stats: integrate dns counters in stats Use the new stats module API to integrate the dns counters in the standard stats. This is done in order to avoid code duplication, keep the code related to cli out of dns and use the full possibility of the stats function, allowing to print dns stats in csv or json format.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	0b70a8a314	MINOR: stats: add config "stats show modules" By default, hide the extra statistics on the html page. Define a new flag STAT_SHMODULES which is activated if the config "stats show modules" is set.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	e3f576c29e	MINOR: stats: display extra proxy stats on the html page Integrate the additional proxy stats on the html stats page. For each module, a new column is displayed with the individual stats available as a tooltip.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	d3700a7fda	MINOR: stats: support clear counters for dynamic stats Add a boolean 'clearable' on stats module structure. If set, it forces all the counters to be reset on 'clear counters' cli command. If not, the counters are reset only when 'clear counters all' is used.	2020-10-05 12:02:14 +02:00
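The behaviour of the 'clearable' boolean in `d3700a7fda` can be sketched as follows. The `StatsModule` class, its fields, and the module names used in the usage note are hypothetical stand-ins, not the stats module structure from the HAProxy sources.

```python
from dataclasses import dataclass, field


@dataclass
class StatsModule:
    name: str
    clearable: bool
    counters: dict = field(default_factory=dict)


def clear_counters(modules, clear_all=False):
    """Mimic 'clear counters' (clear_all=False) and 'clear counters all'."""
    for mod in modules:
        # Non-clearable modules are only reset by the "all" variant.
        if clear_all or mod.clearable:
            for key in mod.counters:
                mod.counters[key] = 0
```

With two modules, one clearable and one not, a plain call resets only the first, and `clear_all=True` resets both.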
Amaury Denoyelle	ee63d4bd67	MEDIUM: stats: integrate static proxies stats in new stats This is executed on startup with the registered statistics module. The existing statistics have been merged in a list containing all statistics for each domain. This is useful to print all available statistics in a generic way. Allocate extra counters for all proxies/servers/listeners instances. These counters are allocated with the counters from the stats modules registered on startup.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	58d395e0d6	MEDIUM: stats: define an API to register stat modules A stat module can be registered to quickly add new statistics on haproxy. It must be attached to one of the available stats domains. Registration must be done using INITCALL on STG_REGISTER. The stat module has a name which should be unique for each new module in a domain. It also contains a statistics list with their name/desc and a pointer to a function used to fill the stats from the module counters. The module also provides the initial counter values used on automatically allocated counters. The offsets for these counters are stored in the module structure.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	50660a894d	MEDIUM: stats: add delimiter for static proxy stats on csv Use the character '-' to mark the end of static statistics on proxy domain. After this marker, the order of the fields is not guaranteed and should be parsed with care.	2020-10-05 12:02:14 +02:00
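A consumer of the CSV output could honour the '-' marker from `50660a894d` by reading the fields before it positionally and the fields after it by name only. The sketch below assumes the marker appears as its own field in the header line and that lines carry a leading "# " and a trailing comma. The sample field names in the usage note are invented for illustration.

```python
def split_csv_header(header_line):
    """Split a stats CSV header into (static_fields, dynamic_fields)."""
    line = header_line.strip()
    if line.startswith("# "):
        line = line[2:]
    fields = line.rstrip(",").split(",")
    if "-" not in fields:
        return fields, []  # no marker: everything is static
    marker = fields.index("-")
    return fields[:marker], fields[marker + 1:]


def parse_row(header_line, row_line):
    """Map a data row to a dict, skipping the marker column."""
    line = header_line.strip()
    if line.startswith("# "):
        line = line[2:]
    names = line.rstrip(",").split(",")
    values = row_line.strip().rstrip(",").split(",")
    return {n: v for n, v in zip(names, values) if n != "-"}
```

For example, `split_csv_header("# pxname,svname,-,mod_x,")` yields `(["pxname", "svname"], ["mod_x"])`. Looking up post-marker values by name rather than by index keeps a parser working if their order changes.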
Amaury Denoyelle	72b16e5173	MINOR: stats: define additional flag px cap on domain This flag can be used to determine on what type of proxy object the statistics should be relevant. It will be useful when adding dynamic statistics. Currently, this flag is not used.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	072f97eddf	MINOR: stats: define the concept of domain for statistics The domain option will be used to have statistics attached to other objects than proxies/listeners/servers. At the moment, only the PROXY domain is available. Add an argument 'domain' on the 'show stats' cli command to specify the domain. Only 'domain proxy' is available now. If not specified, proxy will be considered the default domain. For HTML output, only proxy statistics will be displayed.	2020-10-05 12:02:14 +02:00
Christopher Faulet	f98d821b94	MINOR: hlua: Display debug messages on stderr only in debug mode Debug Messages emitted in lua using core.Debug() or core.log() are now only displayed on stderr if HAProxy is started in debug mode (-d parameter on the command line). There is no change for other message levels. This patch should fix the issue #879. It may be backported to all stable versions.	2020-10-05 11:11:36 +02:00
Amaury Denoyelle	98b81cb393	REORG: stats: extract proxies dump loop in a function Create a dedicated function to loop on proxies and dump them. This will be clearer when other objects are dumped as well. This patch is needed to extend stat support to components other than proxies objects.	2020-10-05 10:54:35 +02:00
Amaury Denoyelle	f34017bb74	REORG: stats: extract proxy json dump Create a dedicated function to dump a proxy as a json content. This patch will be needed when other types of objects will be available for json dump. This patch is needed to extend stat support to components other than proxies objects.	2020-10-05 10:53:50 +02:00
Amaury Denoyelle	da5b6d1cd9	MINOR: stats: hide px/sv/li fields in applet struct Use an opaque pointer to store proxy instance. Regroup server/listener as a single opaque pointer. This has the benefit to render the structure more evolutive to support statistics on other types of objects in the future. This patch is needed to extend stat support for components other than proxies objects. The prometheus module has been adapted for these changes.	2020-10-05 10:48:58 +02:00
Amaury Denoyelle	97323c9ed4	MINOR: stats: add stats size as a parameter for csv/json dump Render the stats size parametric in csv/json dump functions. This is needed for the future patch which provides dynamic stats. For now the static value ST_F_TOTAL_FIELDS is provided. Remove unused parameter px on stats_dump_one_line. This patch is needed to extend stat support to components other than proxies objects.	2020-10-05 09:06:10 +02:00
Amaury Denoyelle	3ca927e68f	REORG: stats: export some functions Un-mark stats_dump_one_line and stats_putchk as static and export them in the header file. These functions will be reusable by other components to print their statistics. This patch is needed to extend stat support to components other than proxies objects.	2020-10-05 09:06:10 +02:00
Amaury Denoyelle	a53ce4cc01	BUG/MINOR: stats: fix validity of the json schema The json schema seems to be invalid when checking using the validator from https://www.jsonschemavalidator.net/. Correct it using the following specification : http://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.9.1 The impact of the bug is not well known as I am not sure of how useful the json schema is for users. It is probably not used at all or else this bug would have been reported. This should be backported up to 1.8.	2020-10-05 09:06:06 +02:00
William Lallemand	51f784bcf9	CLEANUP: ssl: "bundle" is not an OpenSSL wording There is a confusion between the HAProxy bundle and OpenSSL. OpenSSL does not have "bundles" but multiple certificates in the same store. Fix a commentary in the crt-list code.	2020-10-02 18:11:47 +02:00
Christopher Faulet	f7177271f3	BUG/MINOR: tcpcheck: Set socks4 and send-proxy flags before the connect call Since the health-check refactoring in the 2.2, the checks through a socks4 proxy are broken. To fix this bug, CO_FL_SOCKS4 flag must be set on the connection before calling the connect() callback function because this flag is checked to use the right destination address. The same is done for the CO_FL_SEND_PROXY flag for consistency. A reg-test has been added to test the "check-via-socks4" directive. This patch must be backported to 2.2.	2020-10-02 17:14:34 +02:00
Christopher Faulet	2079a4ad36	MEDIUM: tcp-rules: Warn if a track-sc* content rule doesn't depend on content The warning is only emitted for HTTP frontends. The idea is to encourage the usage of "tcp-request session" rules to track counters that do not depend on the request content. The documentation has been updated accordingly. The warning is important because since the multiplexers were added in the processing chain, the HTTP parsing is performed at a lower level. Thus parsing errors are detected in the multiplexers, before the stream creation. In HTTP/2, the error is reported by the multiplexer itself and the stream is never created. This difference has a certain number of consequences, one of which is that HTTP request counting in stick tables only works for valid H2 requests, and HTTP error tracking in stick tables never considers invalid H2 requests but only invalid H1 ones. And the aim is to do the same with the mux-h1. This change will not be done for the 2.3, but the 2.4. At the end, H1 and H2 parsing errors will be caught by the multiplexers, at the session level. Thus, tracking counters at the content level should be reserved for rules using a key based on the request content or those using ACLs based on the request content. To be clear, a warning will be emitted for the following rules : tcp-request content track-sc0 src tcp-request content track-sc0 src if ! { src 10.0.0.0/24 } tcp-request content track-sc0 src if { ssl_fc } But not for the following ones : tcp-request content track-sc0 req.hdr(host) tcp-request content track-sc0 src if { req.hdr(host) -m found }	2020-10-02 15:50:26 +02:00
Eric Salama	7cea6065ac	BUG/MINOR: Fix several leaks of 'log_tag' in init(). We use chunk_initstr() to store the program name as the default log-tag. If we use the log-tag directive in the config file, this chunk will be destroyed and replaced. chunk_initstr() sets the chunk size to 0 so we will free the chunk itself, but not its content. This happens for a global section and also for a proxy. We fix this by using chunk_initlen() instead of chunk_initstr(). We also check that the memory allocation was successful, otherwise we quit. This fixes github issue #850. It can be backported as far as 1.9, with minor adjustments to includes.	2020-10-02 15:50:26 +02:00
William Dauchy	1d0206e71f	MINOR: ssl: remove uneeded check in crtlist_parse_file this condition is never true as we either break or goto error, so those two lines could be removed in the current state of the code. this is fixing github issue #862 Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-10-02 15:43:01 +02:00
Tim Duesterhus	b9f6accc9e	MINOR: ssl: Add error if a crt-list might be truncated Similar to warning during the parsing of the regular configuration file that was added in `2fd5bdb439` this patch adds a warning to the parsing of a crt-list if the file does not end in a newline (and thus might have been truncated). The logic essentially just was copied over. It might be good to refactor this in the future, allowing easy re-use within all line-based config parsers. see https://github.com/haproxy/haproxy/issues/860#issuecomment-693422936 see `0354b658f0` This should be backported as a warning to 2.2.	2020-10-02 12:29:03 +02:00
Tim Duesterhus	6d07fae3c0	CLEANUP: ssl: Use structured format for error line report during crt-list parsing This reuses the known `parsing [%s:%d]:` from regular config file error reporting.	2020-10-02 12:29:03 +02:00
Willy Tarreau	fe2cc41151	BUILD: tools: fix minor build issue on isspace() Previous commit `fa41cb679` ("MINOR: tools: support for word expansion of environment in parse_line") introduced two new isspace() on a char and broke the build on systems using an array disguised in a macro instead of a function (like cygwin). Just use the usual cast.	2020-10-01 18:05:48 +02:00
Amaury Denoyelle	fa41cb6792	MINOR: tools: support for word expansion of environment in parse_line Allow the syntax "${...[*]}" to expand an environment variable containing several values separated by spaces as individual arguments. A new flag PARSE_OPT_WORD_EXPAND has been added to toggle this feature on parse_line invocation. In case of an invalid syntax, a new error PARSE_ERR_WRONG_EXPAND will be triggered. This feature has been asked on the github issue #165.	2020-10-01 17:24:14 +02:00
Willy Tarreau	82cd5c13a5	OPTIM: backend: skip LB when we know the backend is full For some algos (roundrobin, static-rr, leastconn, first) we know that if there is any request queued in the backend, it's because a previous attempt failed at finding a suitable server after trying all of them. This alone is sufficient to decide that the next request will skip the LB algo and directly reach the backend's queue. Doing this alone avoids an O(N) lookup when load-balancing on a saturated farm of N servers, which starts to be very expensive for hundreds of servers, especially under the lbprm lock. This change alone has increased the request rate from 110k to 148k RPS for 200 saturated servers on 8 threads, and fwlc_reposition_srv() doesn't show up anymore in perf top. See github issue #880 for more context. It could have been the same for random, except that random is performed using a consistent hash and it only considers a small set of servers (2 by default), so it may result in queueing at the backend despite having some free slots on unknown servers. It's no big deal though since random() only performs two attempts by default. For hashing algorithms this is pointless since we don't queue at the backend, except when there's no hash key found, which is the least of our concerns here.	2020-09-29 17:18:37 +02:00
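The shortcut described in `82cd5c13a5` can be sketched as below. This is not the backend code from HAProxy: the `Backend` and `Server` classes, the `scans` counter, and the linear scan standing in for the real LB algorithms are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Algorithms for which a non-empty backend queue implies every server was
# already tried and found full (per the commit message).
SHORTCUT_ALGOS = {"roundrobin", "static-rr", "leastconn", "first"}


@dataclass
class Server:
    maxconn: int
    cur: int = 0


@dataclass
class Backend:
    algo: str
    servers: list
    queue: list = field(default_factory=list)
    scans: int = 0  # number of O(N) server lookups performed


def assign(backend, request):
    """Return a server for the request, or None if it was queued."""
    if backend.algo in SHORTCUT_ALGOS and backend.queue:
        backend.queue.append(request)  # skip the LB lookup entirely
        return None
    backend.scans += 1
    for srv in backend.servers:        # stand-in for the real LB algorithm
        if srv.cur < srv.maxconn:
            srv.cur += 1
            return srv
    backend.queue.append(request)
    return None
```

On a saturated farm, only the first failed request pays for the scan. Later ones go straight to the queue until it drains.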
Willy Tarreau	b88ae18021	OPTIM: backend/random: never queue on the server, always on the backend If random() returns a server whose maxconn is reached or the queue is used, instead of adding the request to the server's queue, better add it to the backend queue so that it can be served by any server (hence the fastest one).	2020-09-29 17:18:11 +02:00
William Lallemand	20b0fed28c	BUG/MINOR: ssl/crt-list: exit on warning out of crtlist_parse_line() We should not exit on error out of the crtlist_parse_line() function. The cfgerr error must be checked with the ERR_CODE mask. Must be backported in 2.2.	2020-09-28 15:48:54 +02:00
Miroslav Zagorac	a6aca669b5	BUILD: trace: include tools.h If the TRACE option is used when compiling the haproxy source, the following error occurs on debian 9.13: src/calltrace.o: In function `make_line': .../src/calltrace.c:204: undefined reference to `rdtsc' src/calltrace.o: In function `calltrace': .../src/calltrace.c:277: undefined reference to `rdtsc' collect2: error: ld returned 1 exit status Makefile:866: recipe for target 'haproxy' failed	2020-09-25 17:54:48 +02:00
Willy Tarreau	82cd028d71	BUG/MINOR: listeners: properly close listener FDs The code dealing with zombie proxies in soft_stop() is bogus, it uses close() instead of fd_delete(), leaving a live entry in the fdtab with a dangling pointer to a free memory location. The FD might be reassigned for an outgoing connection for the time it takes the proxy to completely stop, or could be dumped on the CLI's "show fd" command. In addition, the listener's FD was not even reset, leaving doubts about whether or not it will happen again in deinit(). And in deinit(), the loop in charge of closing zombie FDs is particularly unsafe because it closes the fd then calls unbind_listener() then delete_listener() hoping none of them will touch it again. Since it requires some mental efforts to figure what's done there, let's correctly reset the fd here as well and close it using fd_delete() to eliminate any remaining doubts. It's uncertain whether this should be backported. Zombie proxies are rare and the situations capable of triggering such issues are not trivial to set up. However it's easy to imagine how things could go wrong if backported too far. Better wait for any matching report if at all (this code has been there since 1.8 without anybody noticing).	2020-09-25 13:46:47 +02:00
Willy Tarreau	02e1975c29	BUG/MEDIUM: listeners: do not pause foreign listeners There's a nasty case with listeners that belong to foreign processes. If a proxy is defined this way: global nbproc 2 frontend f bind :1111 process 1 bind :2222 process 2 and if stats expose-fd listeners is set, the listeners' FDs will not be closed on the processes that don't use them. At this point it's not a big deal, except that they're shared between processes and that a "disable frontend f" issued on one process will pause all of them and cause the other process to see accept() fail, turning its own listener to state LI_LIMITED to try to leave it some time to recover. But it will never recover, even after an enable. The root cause of the issue is that the ZOMBIE state doesn't cover this situation since it's only for a proxy being entirely bound to a process. What we do here to address this is that we refrain from pausing a file descriptor that belongs to a foreign process in pause_listener(). This definitely solves the problem. A similar test is present in resume_listener() and is the reason why the FD doesn't recover upon the "enable" action by the way. This ought to be backported to 1.8 where seamless reload was integrated. The config above should be sufficient to validate that the fix works; after a pair of "disable/enable frontend" no process will handle the traffic to one of the ports anymore.	2020-09-25 13:46:47 +02:00
Willy Tarreau	57a374131c	MINOR: backend: add a new "path-only" option to "balance uri" Since we've fixed the way URIs are handled in 2.1, some users have started to experience inconsistencies in "balance uri" between requests received over H1 and the same ones received over H2. This is caused by the fact that H1 rarely uses absolute URIs while H2 always uses them. Similar issues were reported already around replace-uri etc, leading to "pathq" recently being introduced, so this isn't new. Here what this patch does is add a new option to "balance uri" to indicate that the hashing should only start at the path and not cover the authority. This makes H1 relative URIs and H2 absolute URIs hash equally again. Some extra options could be added to normalize URIs by always hashing the authority (or host) in front of them, which would make sure that both absolute and relative requests provide the same hash. This is left for later if needed.	2020-09-23 08:56:29 +02:00
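The effect of "path-only" in `57a374131c` can be sketched as a function that selects which part of the URI is fed to the hash. The function name and parameters are hypothetical, and the handling of the query string (dropped unless `whole` is set) is an assumption based on the documented behaviour of "balance uri" rather than on this commit.

```python
def uri_hash_input(uri, whole=False, path_only=False):
    """Return the substring of the URI that would be hashed."""
    # With path_only, skip "scheme://authority" on absolute URIs so that
    # hashing starts at the path.
    if path_only and not uri.startswith("/") and "://" in uri:
        rest = uri.split("://", 1)[1]
        slash = rest.find("/")
        uri = rest[slash:] if slash != -1 else "/"
    # Without "whole", hashing stops at the question mark.
    if not whole:
        uri = uri.split("?", 1)[0]
    return uri
```

With `path_only=True`, the H2-style `http://example.com/a/b?x=1` and the H1-style `/a/b?x=1` both reduce to `/a/b`. Without it, the first keeps its scheme and authority and so hashes differently.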
Willy Tarreau	3d1119d225	MINOR: backend: make the "whole" option of balance uri take only one bit We'll want to add other boolean options on "balance uri", so let's make some room aside "whole" and make it take only one bit and not one int.	2020-09-23 08:05:47 +02:00
Amaury Denoyelle	36b536652f	BUG/MINOR: config: Fix memory leak on config parse listen This memory leak happens if there are two or more defaults sections. When the default proxy is reinitialized, the structure member containing the config filename must be freed. Fix github issue #851. Should be backported as far as 1.6.	2020-09-18 16:17:09 +02:00
Eric Salama	1aab911017	BUG/MINOR: Fix memory leaks cfg_parse_peers When memory allocation fails in cfg_parse_peers or when an error occurs while parsing a stick-table, the temporary table and its id must be freed. This fixes github issue #854. It should be backported as far as 2.0.	2020-09-18 12:06:08 +02:00
Christopher Faulet	d2414a23c4	BUG/MINOR: http-fetch: Don't set the sample type during the htx prefetch A subtle bug was introduced by the commit `a6d9879e6` ("BUG/MEDIUM: htx: smp_prefetch_htx() must always validate the direction"), for the "method" sample fetch only. The sample data type and the method id are always overwritten because smp_prefetch_htx() function is called later in the sample fetch evaluation. The bug is in the smp_prefetch_htx() function but it is only visible for the "method" sample fetch, for an unknown method. In fact, when smp_prefetch_htx() is called, the sample object is altered. The data type is set to SMP_T_BOOL and, on success, the data value is set to 1. Thus, if the caller has already set some info into the sample object, it may be lost. AFAIK, there is no reason to do so. It is inherited from the legacy HTTP code and I honestly don't know why it was done this way. So, instead of fixing the "method" sample fetch to set useful info after the call to smp_prefetch_htx() function, I prefer to not alter the sample object in smp_prefetch_htx(). This patch must be backported as far as 2.0. On the 2.0, only the HTX part must be fixed.	2020-09-18 11:06:24 +02:00
Willy Tarreau	bba7a4dafd	BUG/MINOR: h2/trace: do not display "stream error" after a frame ACK When sending a frame ACK, the parser state is not equal to H2_CS_FRAME_H and we used to report it as an error, which is not true. In fact we should only indicate when we skip remaining data. This may be backported as far as 2.1.	2020-09-18 07:41:28 +02:00
Willy Tarreau	8520d87198	MINOR: h2/trace: also display the remaining frame length in traces It's often missing when debugging, even though it's often zero for control frames or after data are consumed.	2020-09-18 07:39:29 +02:00
Willy Tarreau	f2cda10b1d	BUILD: sock_inet: include errno.h I was careful to have it for sock_unix.c but missed it for sock_inet which broke with commit `36722d227` ("MINOR: sock_inet: report the errno string in binding errors") depending on the build options. No backport is needed.	2020-09-17 14:02:01 +02:00
Willy Tarreau	3cd58bf805	MINOR: sock_unix: report the errno string in binding errors Just like with the previous patch, let's report UNIX socket binding errors in plain text. We can now see for example: [ALERT] 260/083531 (13365) : Starting frontend f: cannot switch final and temporary UNIX sockets (Operation not permitted) [/tmp/root.sock] [ALERT] 260/083640 (13375) : Starting frontend f: cannot change UNIX socket ownership (Operation not permitted) [/tmp/root.sock]	2020-09-17 08:35:38 +02:00
Willy Tarreau	36722d2274	MINOR: sock_inet: report the errno string in binding errors With the socket binding code cleanup it becomes easy to add more info to error messages. One missing thing used to be the error string, which is now added after the generic one, for example: [ALERT] 260/082852 (12974) : Starting frontend f: cannot bind socket (Permission denied) [0.0.0.0:4] [ALERT] 260/083053 (13292) : Starting frontend f: cannot bind socket (Address already in use) [0.0.0.0:4444] [ALERT] 260/083104 (13298) : Starting frontend f: cannot bind socket (Cannot assign requested address) [1.1.1.1:4444]	2020-09-17 08:32:17 +02:00
Willy Tarreau	eb8cfe6723	BUILD: sock_unix: add missing errno.h It builds fine when openssl is enabled, but fails otherwise. No backport is needed.	2020-09-16 22:15:40 +02:00
Willy Tarreau	af9609b4d1	MINOR: tools: drop listener detection hack from str2sa_range() We used to resort to a trick to detect whether the caller was a listener or an outgoing socket in order never to present an AF_CUST_UDP* socket to a log server nor a nameserver. This is no longer necessary, the socket type alone will be enough.	2020-09-16 22:08:08 +02:00
Willy Tarreau	2b5e0d8b6a	MEDIUM: proto_udp: replace last AF_CUST_UDP* with AF_INET* We don't need to cheat with the sock_domain anymore, we now always have the SOCK_DGRAM sock_type as a complementary selector. This patch restores the sock_domain to AF_INET* in the udp* protocols and removes all traces of the now unused AF_CUST_*.	2020-09-16 22:08:08 +02:00
Willy Tarreau	b2ffc99bbd	MEDIUM: tools: make str2sa_range() use protocol_lookup() By doing so we can remove the hard-coded mapping from AF_INET to AF_CUST_UDP but we still need to keep the test on the listeners as long as these dummy families remain present in the code.	2020-09-16 22:08:08 +02:00
Willy Tarreau	910c64da96	MEDIUM: protocol: store the socket and control type in the protocol array The protocol array used to be only indexed by socket family, which is very problematic with UDP (requiring an extra family) and with the forthcoming QUIC (also requiring an extra family), especially since that binds them to certain families, prevents them from supporting dgram UNIX sockets etc. In order to address this, we now start to register the protocols with more info, namely the socket type and the control type (either stream or dgram). This is sufficient for the protocols we have to deal with, but could also be extended further if multiple protocol variants were needed. But as is, it still fits nicely in an array, which is convenient for lookups that are instant.	2020-09-16 22:08:08 +02:00
Willy Tarreau	a54553f74f	MINOR: protocol: add the control layer type in the protocol struct This one will be needed to more accurately select a protocol. It may differ from the socket type for QUIC, which uses dgram at the socket layer and provides stream at the control layer. The upper level requests a control layer only so we need this field.	2020-09-16 22:08:08 +02:00
Willy Tarreau	65ec4e3ff7	MEDIUM: tools: make str2sa_range() check that the protocol has ->connect() Most callers of str2sa_range() need the protocol only to check that it provides a ->connect() method. It used to be used to verify that it's a stream protocol, but it might be a bit early to get rid of it. Let's keep the test for now but move it to str2sa_range() when the new flag PA_O_CONNECT is present. This way almost all call places could be cleaned from this. There's a strange test in the server address parsing code that rechecks the family from the socket which seems to be a duplicate of the previously removed tests. It will have to be rechecked.	2020-09-16 22:08:08 +02:00
Willy Tarreau	5fc9328aa2	MINOR: tools: make str2sa_range() directly return the protocol We'll need this so that it can return pointers to stacked protocol in the future (for QUIC). In addition this removes a lot of tests for protocol validity in the callers. Some of them were checked further apart, or after a call to str2listener() and they were simplified as well. There's still a trick, we can fail to return a protocol in case the caller accepts an fqdn for use later. This is what servers do and in this case it is valid to return no protocol. A typical example is: server foo localhost:1111	2020-09-16 22:08:08 +02:00
Willy Tarreau	9b3178df23	MINOR: listener: pass the chosen protocol to create_listeners() The function will need to use more than just a family, let's pass it the selected protocol. The caller will then be able to do all the fancy stuff required to pick the best protocol.	2020-09-16 22:08:08 +02:00
Willy Tarreau	5e1779abbf	MEDIUM: config: make str2listener() not accept datagram sockets anymore str2listener() was temporarily hacked to support datagram sockets for the log-forward listeners. This had the undesirable side effect that "bind udp@1.2.3.4:5555" was silently accepted as TCP for a bind line. We don't need this hack anymore since the only user (log-forward) now relies on str2receiver(). Now such an address will properly be rejected.	2020-09-16 22:08:08 +02:00
Willy Tarreau	26ff5dabc0	MINOR: log-forward: use str2receiver() to parse the dgram-bind address Thanks to this we don't need to specify "udp@" as it's implicitly a datagram type listener that is expected, so any AF_INET/AF_INET6 address will work.	2020-09-16 22:08:08 +02:00
Willy Tarreau	aa333123f2	MINOR: cfgparse: add str2receiver() to parse dgram receivers This is at least temporary, as the migration at once is way too difficult. For now it still creates listeners but only allows DGRAM sockets. This aims at easing the split between listeners and receivers.	2020-09-16 22:08:08 +02:00
Willy Tarreau	62a976cd44	MINOR: tools: remove the central test for "udp" in str2sa_range() Now we only rely on dgram type associated with AF_INET/AF_INET6 to infer UDP4/UDP6. We still keep the hint based on PA_O_SOCKET_FD to detect that the caller is a listener though. It's still far from optimal but UDP remains rooted into the protocols and needs to be taken out first.	2020-09-16 22:08:08 +02:00
Willy Tarreau	3baec249b1	MEDIUM: tools: make str2sa_range() only report AF_CUST_UDP on listeners For now only listeners can make use of AF_CUST_UDP and it requires hacks in the DNS and logsrv code to remap it to AF_INET. Make str2sa_range() smarter by detecting that it's called for a listener and only set these protocol families for listeners. This way we can get rid of the hacks.	2020-09-16 22:08:08 +02:00
Willy Tarreau	e835bd8f91	MINOR: tools: start to distinguish stream and dgram in str2sa_range() The parser now supports a socket type for the control layer and a possible other one for the transport layer. Usually they are the same except for protocols like QUIC which will provide a stream transport layer based on a datagram control layer. The default types are preset based on the caller's expectations, and may be refined using "stream+" and "dgram+" prefixes. For now they were not added to the documentation because other changes will probably happen around UDP as well. It is conceivable that "tcpv4@" or "udpv6@" will appear later as aliases for "stream+ipv4" or "dgram+ipv6".	2020-09-16 22:08:08 +02:00
Willy Tarreau	a215be282d	MEDIUM: tools: make str2sa_range() check for the sockpair's FD usability Just like for inherited sockets, we want to make sure that FDs that are mentioned in "sockpair@" are actually usable. Right now this test is performed by the callers, but not everywhere. Typically, the following config will fail if fd #5 is not bound: frontend bind sockpair@5 But this one will pass if fd #6 is not bound: backend server s1 sockpair@6 Now both will return an error in such a case: - 'bind' : cannot use file descriptor '5' : Bad file descriptor. - 'server s1' : cannot use file descriptor '6' : Bad file descriptor. As such the test in str2listener() is not needed anymore (and it was wrong by the way, as it used to test for the socket by overwriting the local address with a new address that's made of the FD encoded on 16 bits and happens to still be at the same place, but that strictly depends on whatever the kernel wants to put there).	2020-09-16 22:08:08 +02:00
Willy Tarreau	804f11fdf8	MINOR: config: do not test an inherited socket again Since the previous patch we know that a successfully bound fd@XXX socket is returned as its own protocol family from str2sa_range() and not as AF_CUST_EXISTING_FD anymore, so we don't need to check for that case in str2listener().	2020-09-16 22:08:08 +02:00
Willy Tarreau	6edc722093	MEDIUM: tools: make str2sa_range() resolve pre-bound listeners When str2sa_range() is invoked for a bind or log line, and it gets a file descriptor number, it will immediately resolve the socket's address (when it's a socket) so that the address family, address and port are correctly set. This will later allow to resolve some transport protocols that are attached to existing FDs. For raw FDs (e.g. logs) and for socket pairs, the FD number is still returned in the address, because we need the underlying address management to complete the bind/listen/connect/whatever needed. One immediate benefit is that passing a bad FD will now result in one of these errors: 'bind' : cannot use file descriptor '3' : Socket operation on non-socket. 'bind' : socket on file descriptor '3' is of the wrong type. Note that as of now, we never return a listening socket with a family of AF_CUST_EXISTING_FD. The only case where this family is seen is for a raw FD (e.g. logs).	2020-09-16 22:08:08 +02:00
Willy Tarreau	895992619d	MINOR: log: detect LOG_TARGET_FD from the fd and not from the syntax Now that we have the FD value reported we don't need to cheat and detect "fd@" in the address, we can safely rely on the FD value.	2020-09-16 22:08:08 +02:00
Willy Tarreau	a93e5c7fae	MINOR: tools: make str2sa_range() optionally return the fd If a file descriptor was passed, we can optionally return it. This will be useful for listening sockets which are both a pre-bound FD and a ready socket.	2020-09-16 22:08:08 +02:00
Willy Tarreau	909c23b086	MINOR: listener: remove the inherited arg to create_listener() This argument can now safely be determined from fd != -1, let's just drop it.	2020-09-16 22:08:08 +02:00
Willy Tarreau	328199348b	MINOR: tools: add several PA_O_* flags in str2sa_range() callers These flags indicate whether the call is made to fill a bind or a server line, or even just send/recv calls (like logs or dns). Some special cases are made for outgoing FDs (e.g. pipes for logs) or socket FDs (e.g external listeners), and there's a distinction between stream or dgram usage that's expected to significantly help str2sa_range() proceed appropriately with the input information. For now they are not used yet.	2020-09-16 22:08:08 +02:00
Willy Tarreau	8b0fa8f0ab	MEDIUM: config: remove all checks for missing/invalid ports/ranges Now that str2sa_range() checks for appropriate port specification, we don't need to implement adhoc test cases in every call place, if the result is valid, the conditions are met otherwise the error message is appropriately filled.	2020-09-16 22:08:08 +02:00
Willy Tarreau	7f96a8474c	MEDIUM: tools: make str2sa_range() validate callers' port specifications Now str2sa_range() will enforce the caller's port specification passed using the PA_O_PORT_* flags, and will return an error on failure. For optional ports, values 0-65535 will be enforced. For mandatory ports, values 1-65535 are enforced. In case of ranges, it is also verified that the upper bound is not lower than the lower bound, as this used to result in empty listeners. I couldn't find an easy way to test this using VTC since the purpose is to trigger parse errors, so instead a test file is provided as tests/ports.cfg with comments about what errors are expected for each line.	2020-09-16 22:08:08 +02:00
Willy Tarreau	809587635e	MINOR: tools: add several PA_O_PORT_* flags in str2sa_range() callers These flags indicate what is expected regarding port specifications. Some callers accept none, some need fixed ports, some have it mandatory, some support ranges, and some take an offset. Each possibility is reflected by an option. For now they are not exploited, but the goal is to instrument str2sa_range() to properly parse that.	2020-09-16 22:08:07 +02:00
Willy Tarreau	cd3a5591f6	MINOR: tools: make str2sa_range() take more options than just resolve We currently have an argument to require that the address is resolved but we'll soon add more, so let's turn it into a bit field. The old "resolve" boolean is now PA_O_RESOLVE.	2020-09-16 22:08:07 +02:00
Willy Tarreau	5a7beed67b	CLEANUP: tools: make str2sa_range() less awful for fd@ and sockpair@ The code is built to match prefixes at one place and to parse the address as a second step, except for fd@ and sockpair@ where the test first passes via AF_UNSPEC that is changed again. This is ugly and confusing, so let's proceed like for the other ones.	2020-09-16 22:08:07 +02:00
Willy Tarreau	a5b325f92c	MINOR: protocol: add a real family for existing FDs At some places (log fd@XXX, bind fd@XXX) we support using an explicit file descriptor number, that is placed into the sockaddr for later use. The problem is that till now it was done with an AF_UNSPEC family, which is also used for other situations like missing info or rings (for logs). Let's create an "official" family AF_CUST_EXISTING_FD for this case so that we are certain the FD can be found in the address when it is set.	2020-09-16 22:08:07 +02:00
Willy Tarreau	1e984b73f0	CLEANUP: protocol: remove family-specific fields from struct protocol This removes the following fields from struct protocol that are now retrieved from the protocol family instead: .sock_family, .sock_addrlen, .l3_addrlen, .addrcmp, .bind, .get_src, .get_dst. This also removes the UDP-specific udp{,6}_get_{src,dst}() functions which were referenced but not used yet. Their goal was only to remap the original AF_INET* addresses to AF_CUST_UDP*. Note that .sock_domain is still there as it's used as a selector for the protocol struct to be used.	2020-09-16 22:08:07 +02:00
Willy Tarreau	f1f660978c	MINOR: protocol: retrieve the family-specific fields from the family We now take care of retrieving sock_family, l3_addrlen, bind(), addrcmp(), get_src() and get_dst() from the protocol family and not just the protocol itself. There are very few places, this was only seldom used. Interestingly, sock_inet.c used to rely on ->sock_family instead of ->sock_domain, and sock_unix.c used to hard-code PF_UNIX instead of using ->sock_domain. Also it appears obvious we have something wrong in the protocol selection algorithm because sock_domain is the one set to the custom protocols while it ought to be sock_family instead, which would avoid having to hard-code some conversions for UDP namely.	2020-09-16 22:08:07 +02:00
Willy Tarreau	b0254cb361	MINOR: protocol: add a new proto_fam structure for protocol families We need to specially handle protocol families which regroup common functions used for a given address family. These functions include bind(), addrcmp(), get_src() and get_dst() for now. Some fields are also added about the address family, socket domain (protocol family passed to the socket() syscall), and address length. These protocol families are referenced from the protocols but not yet used.	2020-09-16 22:08:07 +02:00
Willy Tarreau	ad33acf838	MEDIUM: protocol: do not call proto->bind() anymore from bind_listener() All protocol's listeners now only take care of themselves and not of the receiver anymore since that's already being done in protocol_bind_all(). Now it finally becomes obvious that UDP doesn't need a listener, as the only thing it does is to set the listener's state to LI_LISTEN!	2020-09-16 22:08:07 +02:00
Willy Tarreau	fc974887ce	MEDIUM: protocol: explicitly start the receiver before the listener Now protocol_bind_all() starts the receivers before their respective listeners so that ultimately we won't need the listeners for non- connected protocols. We still have to resort to an ugly trick to set the I/O handler in case of syslog over UDP because for now it's still not set in the receiver, so we hard-code it.	2020-09-16 22:08:07 +02:00
Willy Tarreau	9eda7a6d62	MEDIUM: proto_sockpair: make use of sockpair_bind_receiver() Now we rely on the address family's receiver instead of binding everything ourselves.	2020-09-16 22:08:07 +02:00
Willy Tarreau	62292b28a3	MEDIUM: sockpair: implement sockpair_bind_receiver() Note that for now we don't have a sockpair.c file to host that unusual family, so the new function was placed directly into proto_sockpair.c. It's no big deal given that this family is currently not shared with multiple protocols. The function does almost nothing but setting up the receiver. This is normal as the socket the FDs are passed onto are supposed to have been already created somewhere else, and the only usable identifier for such a socket pair is the receiving FD itself. The function was assigned to sockpair's ->bind() and is not used yet.	2020-09-16 22:08:07 +02:00
Willy Tarreau	cd5e5eaf50	MEDIUM: uxst: make use of sock_unix_bind_receiver() This removes all the AF_UNIX-specific code from uxst_bind_listener() and now simply relies on sock_unix_bind_receiver() to do the same job. As mentioned in the previous commit, the only difference is that now an unlikely failure on listen() will not result in a roll back of the temporary socket names since they will have been renamed during the bind() operation (as expected). But such failures do not correspond to any normal case and mostly denote operating system issues so there's no functionality loss here.	2020-09-16 22:08:07 +02:00
Willy Tarreau	1e0a860099	MEDIUM: sock_unix: implement sock_unix_bind_receiver() This function performs all the bind-related stuff for UNIX sockets that was previously done in uxst_bind_listener(). There is a very tiny difference however, which is that previously, in the unlikely event where listen() would fail, it was still possible to roll back the binding and rename the backup to the original socket. Now we have to rename it before returning, hence it will be done before calling listen(). However, this doesn't cover any particular use case since listen() has no reason to fail there (and the rollback is not done for inherited sockets), that was just done that way as a generic error processing path. The code is not used yet and is referenced in the uxst proto's ->bind().	2020-09-16 22:08:07 +02:00
Willy Tarreau	2f7687d0e8	MEDIUM: udp: make use of sock_inet_bind_receiver() This removes all the AF_INET-specific code from udp_bind_listener() and now simply relies on sock_inet_bind_receiver() to do the same job. The function is now basically just a wrapper around sock_inet_bind_receiver().	2020-09-16 22:08:07 +02:00
Willy Tarreau	af9a7f5bb0	MEDIUM: tcp: make use of sock_inet_bind_receiver() This removes all the AF_INET-specific code from tcp_bind_listener() and now simply relies on sock_inet_bind_receiver() to do the same job. The function is now roughly cut in half and its error path significantly simplified.	2020-09-16 22:08:07 +02:00
Willy Tarreau	d69ce1ffbc	MEDIUM: sock_inet: implement sock_inet_bind_receiver() This function collects all the receiver-specific code from both tcp_bind_listener() and udp_bind_listener() in order to provide a more generic AF_INET/AF_INET6 socket binding function. For now the API is not very elegant because some info are still missing from the receiver while there's no ideal place to fill them except when calling ->listen() at the protocol level. It looks like some polishing code is needed in check_config_validity() or somewhere around this in order to finalize the receivers' setup. The main issue is that listeners and receivers are created before bind_conf options are parsed and that there's no finishing step to resolve some of them. The function currently sets up a receiver and subscribes it to the poller. In an ideal world we wouldn't subscribe it but let the caller do it after having finished to configure the L4 stuff. The problem is that the caller would then need to perform an fd_insert() call and to possibly set the exported flag on the FD while it's not its job. Maybe an improvement could be to have a separate sock_start_receiver() call in sock.c. For now the function is not used but it will soon be. It's already referenced as tcp and udp's ->bind().	2020-09-16 22:08:07 +02:00
Willy Tarreau	b3580b19c8	MINOR: protocol: rename the ->bind field to ->listen The function currently is doing both the bind() and the listen(), so let's call it ->listen so that the bind() operation can move to another place.	2020-09-16 22:08:07 +02:00
Willy Tarreau	c049c0d5ad	MINOR: sock: make sock_find_compatible_fd() only take a receiver We don't need to have a listener anymore to find an fd, a receiver with its settings properly set is enough now.	2020-09-16 22:08:07 +02:00
Willy Tarreau	3fd3bdc836	MINOR: receiver: move the FOREIGN and V6ONLY options from listener to settings The new RX_O_FOREIGN, RX_O_V6ONLY and RX_O_V4V6 options are now set into the rx_settings part during the parsing, so that we don't need to adjust them in each and every listener anymore. We have to keep both v4v6 and v6only due to the precedence from v6only over v4v6.	2020-09-16 22:08:07 +02:00
Willy Tarreau	43046fa4f4	MINOR: listener: move the INHERITED flag down to the receiver It's the receiver's FD that's inherited from the parent process, not the listener's so the flag must move to the receiver so that appropriate actions can be taken.	2020-09-16 22:08:07 +02:00
Willy Tarreau	0b9150155e	MINOR: receiver: add a receiver-specific flag to indicate the socket is bound In order to split the receiver from the listener, we'll need to know that a socket is already bound and ready to receive. We used to do that via the LI_O_ASSIGNED state but that's not sufficient anymore since the receiver might not belong to a listener anymore. The new RX_F_BOUND flag is used for this.	2020-09-16 22:08:07 +02:00
Willy Tarreau	818a92e87a	MINOR: listener: prefer to retrieve the socket's settings via the receiver Some socket settings used to be retrieved via the listener and the bind_conf. Now instead we use the receiver and its settings whenever appropriate. This will simplify the removal of the dependency on the listener.	2020-09-16 22:08:07 +02:00
Willy Tarreau	eef454224d	MINOR: receiver: link the receiver to its owner A receiver will have to pass a context to be installed into the fdtab for use by the handler. We need to set this into the receiver struct as the bind will happen long after the configuration.	2020-09-16 22:08:07 +02:00
Willy Tarreau	0fce6bce34	MINOR: receiver: link the receiver to its settings Just like listeners keep a pointer to their bind_conf, receivers now also have a pointer to their rx_settings. All those belonging to a listener are automatically initialized with a pointer to the bind_conf's settings.	2020-09-16 22:08:07 +02:00
Willy Tarreau	4dfabfed13	MINOR: listener: make sock_find_compatible_fd() check the socket type sock_find_compatible_fd() can now access the protocol via the receiver hence it can access its socket type and know whether the receiver has dgram or stream sockets, so we don't need to hack around AF_CUST_UDP* anymore there.	2020-09-16 22:08:07 +02:00
Willy Tarreau	b743661f04	REORG: listener: move the listener's proto to the receiver The receiver is the one which depends on the protocol while the listener relies on the receiver. Let's move the protocol there. Since there's also a list element to get back to the listener from the proto list, this list element (proto_list) was moved as well. For now when scanning protos, we still see listeners which are linked by their rx.proto_list part.	2020-09-16 22:08:05 +02:00
Willy Tarreau	38ba647f9f	REORG: listener: move the receiving FD to struct receiver The listening socket is represented by its file descriptor, which is generic to all receivers and not just listeners, so it must move to the rx struct. It's worth noting that in order to extend receivers and listeners to other protocols such as QUIC, we'll need other handles than file descriptors here, and that either a union or a cast to uintptr_t will have to be used. This was not done yet and the field was preserved under the name "fd" to avoid adding confusion.	2020-09-16 22:08:03 +02:00
Willy Tarreau	371590661e	REORG: listener: move the listening address to a struct receiver The address will be specific to the receiver so let's move it there.	2020-09-16 22:08:01 +02:00
Willy Tarreau	be56c1038f	MINOR: listener: move the network namespace to the struct settings The netns is common to all listeners/receivers and is used to bind the listening socket so it must be in the receiver settings and not in the listener. This removes yet another set of unnecessary loops.	2020-09-16 20:13:13 +02:00
Willy Tarreau	7e307215e8	MINOR: listener: move the interface to the struct settings The interface is common to all listeners/receivers and is used to bind the listening socket so it must be in the receiver settings and not in the listener. This removes some unnecessary loops.	2020-09-16 20:13:13 +02:00
Willy Tarreau	e26993c098	MINOR: listener: move bind_proc and bind_thread to struct settings As mentioned previously, these two fields come under the settings struct since they'll be used to bind receivers as well.	2020-09-16 20:13:13 +02:00
Willy Tarreau	6e459d7f92	MINOR: listener: create a new struct "settings" in bind_conf There currently is a large inconsistency in how binding parameters are split between bind_conf and listeners. It happens that for historical reasons some parameters are available at the listener level but cannot be configured per-listener but only for a bind_conf, and thus, need to be replicated. In addition, some of the bind_conf parameters are in fact for the listening socket itself while others are for the instantiated sockets. A previous attempt at splitting listeners into receivers failed because the boundary between all these settings is not well defined. This patch introduces a level of listening socket settings in the bind_conf, that will be detachable later. Such settings that are solely for the listening socket are: - unix socket permissions (used only during binding) - interface (used for binding) - network namespace (used for binding) - process mask and thread mask (used during startup) The rest seems to be used only to initialize the resulting sockets, or to control the accept rate. For now, only the unix params (bind_conf->ux) were moved there.	2020-09-16 20:13:13 +02:00
Willy Tarreau	e42d87f3de	BUG/MINOR: dns: gracefully handle the "udp@" address format for nameservers Just like with previous commit, DNS nameservers are affected as well with addresses starting in "udp@", but here it's different, because due to another bug in the DNS parser, the address is rejected, indicating that it doesn't have a ->connect() method. Similarly, the DNS code believes it's working on top of TCP at this point and this used to work because of this. The same fix is applied to remap the protocol and the ->connect test was dropped. No backport is needed, as the ->connect() test will never strike in 2.2 or below.	2020-09-16 20:11:52 +02:00
Willy Tarreau	e1c4c80441	BUG/MINOR: log: gracefully handle the "udp@" address format for log servers Commit `3835c0dcb` ("MEDIUM: udp: adds minimal proto udp support for message listeners.") introduced a problematic side effect in log server address parser: if "udp@", "udp4@" or "udp6@" prefixes a log server's address, the address is passed as-is to the log server with a non-existing family and fails like this when trying to send: [ALERT] 259/195708 (3474) : socket() failed in logger #1: Address family not supported by protocol (errno=97) The problem is that till now there was no UDP family, so logs expect an AF_INET family to be passed for UDP there. This patch manually remaps AF_CUST_UDP4 and AF_CUST_UDP6 to their "tcp" equivalent that the log server parser expects. No backport is needed.	2020-09-16 20:11:52 +02:00
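
The port rules described in commit 7f96a8474c above (0-65535 for optional ports, 1-65535 for mandatory ones, and an upper bound that is not lower than the lower bound) can be illustrated with a small standalone sketch. This is not HAProxy's code: the `PORT_*` flags and the `check_ports()` helper are hypothetical names, only loosely modeled on the idea behind the PA_O_PORT_* options.

```c
#include <stddef.h>

/* Hypothetical flags, loosely modeled on the PA_O_PORT_* idea; the names
 * and values are illustrative and are not HAProxy's definitions. */
enum port_opt {
    PORT_OK    = 1 << 0,  /* a port may be specified */
    PORT_MAND  = 1 << 1,  /* a port must be specified */
    PORT_RANGE = 1 << 2   /* a low-high range is accepted */
};

/* Validate an already parsed port specification.
 *   has_port : non-zero when the input contained a port
 *   low/high : bounds of the range (equal for a single port)
 * Returns NULL on success, or a static error message on failure. */
static const char *check_ports(int has_port, long low, long high, unsigned opts)
{
    long min;

    if (!has_port)
        return (opts & PORT_MAND) ? "missing port" : NULL;
    if (!(opts & (PORT_OK | PORT_MAND)))
        return "port not permitted here";
    if (low != high && !(opts & PORT_RANGE))
        return "port range not permitted here";

    /* optional ports accept 0-65535, mandatory ones 1-65535 */
    min = (opts & PORT_MAND) ? 1 : 0;
    if (low < min || low > 65535 || high < min || high > 65535)
        return "port out of range";

    /* a reversed range would otherwise yield an empty set of listeners */
    if (high < low)
        return "upper bound lower than lower bound";
    return NULL;
}
```

Returning a message rather than a bare error code mirrors the intent stated in the commits: callers no longer implement their own checks and simply report whatever the parser says.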

10520 Commits