haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-15 03:26:59 +02:00

Author	SHA1	Message	Date
Willy Tarreau	4994b57728	MINOR: vars: add a VF_CREATEONLY flag for creation Passing this flag to var_set() will result in the variable to only be created if it did not exist, otherwise nothing is done (it's not even updated). This will be used for pre-registering names.	2021-09-08 11:47:30 +02:00
Willy Tarreau	7978c5c422	MEDIUM: vars: make the ifexist variant of set-var only apply to the proc scope When setting variables, there are currently two variants, one which will always create the variable, and another one, "ifexist", which will only create or update a variable if a similarly named variable in any scope already existed before. The goal was to limit the risk of injecting random names in the proc scope, but it was achieved by making use of the somewhat limited name indexing model, which explains the scope-agnostic restriction. With this change, we're moving the check downwards in the chain, at the variable level, and only variables under the scope "proc" will be subject to the restriction. A new set of VF_* flags was added to adjust how variables are set, and VF_UPDATEONLY is used to mention this restriction. In this exact state of affairs, this is not completely exact, as if a similar name was not known in any scope, the variable will continue to be rejected like before, but this will change soon.	2021-09-08 11:47:06 +02:00
Willy Tarreau	b7bfcb3ff3	MINOR: vars: rename vars_init() to vars_init_head() The vars_init() name is particularly confusing as it does not initialize the variables code but the head of a list of variables passed in arguments. And we'll soon need to have proper initialization code, so let's rename it now.	2021-09-08 11:10:16 +02:00
Willy Tarreau	10080716bf	MINOR: proxy: add a global "grace" directive to postpone soft-stop In ticket #1348 some users expressed some concerns regarding the removal of the "grace" directive from the proxies. Their use case very closely mimics the original intent of the grace keyword, which is to let haproxy accept traffic for some time when stopping, while indicating to an external LB that it's stopping. This is implemented here by starting a task whose expiration triggers the soft-stop for real. The global "stopping" variable is immediately set however. For example, the configuration below will be sufficient to instantly notify an external check on port 9999 that the service is going down, while other services remain active for 10s: global grace 10s frontend ext-check bind :9999 monitor-uri /ext-check monitor fail if { stopping }	2021-09-07 17:34:29 +02:00
Willy Tarreau	3b69886f7d	BUG/MAJOR: htx: fix missing header name length check in htx_add_header/trailer Ori Hollander of JFrog Security reported that htx_add_header() and htx_add_trailer() were missing a length check on the header name. While this does not allow to overwrite any memory area, it results in bits of the header name length to slip into the header value length and may result in forging certain header names on the input. The sad thing here is that a FIXME comment was present suggesting to add the required length checks :-( The injected headers are visible to the HTTP internals and to the config rules, so haproxy will generally stay synchronized with the server. But there is one exception which is the content-length header field, because it is already deduplicated on the input, but before being indexed. As such, injecting a content-length header after the deduplication stage may be abused to present a different, shorter one on the other side and help build a request smuggling attack, or even maybe a response splitting attack. CVE-2021-40346 was assigned to this problem. As a mitigation measure, it is sufficient to verify that no more than one such header is present in any message, which is normally the case thanks to the duplicate checks: http-request deny if { req.hdr_cnt(content-length) gt 1 } http-response deny if { res.hdr_cnt(content-length) gt 1 } This must be backported to all HTX-enabled versions, hence as far as 2.0. In 2.3 and earlier, the functions are in src/htx.c instead. Many thanks to Ori for his work and his responsible report!	2021-09-03 16:15:29 +02:00
Willy Tarreau	3d5f19e04d	CLEANUP: htx: remove comments about "must be < 256 MB" Since commit "BUG/MINOR: config: reject configs using HTTP with bufsize >= 256 MB" we are now sure that it's not possible anymore to have an HTX block of a size 256 MB or more, even after concatenation thanks to the tests for len >= htx_free_data_space(). Let's remove these now obsolete comments. A BUG_ON() was added in htx_add_blk() to track any such exception if the conditions would change later, to complete the one that is performed on the start address that must remain within the buffer.	2021-09-03 16:15:29 +02:00
Willy Tarreau	e352b9dac7	MINOR: vars: make vars_get_by_* support an optional default value In preparation for supporting default values when fetching variables, we need to update the internal API to pass an extra argument to functions vars_get_by_{name,desc} to provide an optional default value. This patch does this and always passes NULL in this argument. var_to_smp() was extended to fall back to this value when available.	2021-09-03 12:08:54 +02:00
Willy Tarreau	9a621ae76d	MEDIUM: vars: add a new "set-var-fmt" action The set-var() action is convenient because it preserves the input type but it's a pain to deal with when trying to concatenate values. The most recurring example is when it's needed to build a variable composed of the source address and the source port. Usually it ends up like this: tcp-request session set-var(sess.port) src_port tcp-request session set-var(sess.addr) src,concat(":",sess.port) This is even worse when trying to aggregate multiple fields from stick-table data for example. Due to this a lot of users instead abuse headers from HTTP rules: http-request set-header(x-addr) %[src]:%[src_port] But this requires some careful cleanups to make sure they won't leak, and it's significantly more expensive to deal with. And generally speaking it's not clean. Plus it must be performed for each and every request, which is expensive for this common case of ip+port that doesn't change for the whole session. This patch addresses this limitation by implementing a new "set-var-fmt" action which performs the same work as "set-var" but takes a format string in argument instead of an expression. This way it becomes pretty simple to just write: tcp-request session set-var-fmt(sess.addr) %[src]:%[src_port] It is usable in all rulesets that already support the "set-var" action. It is not yet implemented for the global "set-var" directive (which already takes a string) and the CLI's "set var" command, which would definitely benefit from it but currently uses its own parser and engine, thus it must be reworked. The doc and regtests were updated.	2021-09-02 21:22:22 +02:00
Willy Tarreau	57467b8356	MINOR: sample: add missing ARGC_ entries For a long time we couldn't have arguments in expressions used in tcp-request, tcp-response etc rules. But now due to the variables it's possible, and their context in case of failure to resolve an argument (e.g. backend name not found) is not properly reported because there is no arg context values in ARGC_* to report them. Let's add a number of missing ones for tcp-request {connection, session,content}, tcp-response content, tcp-check, the config parser (for "set-var" in the global section) and the CLI parser (for "set-var" on the CLI).	2021-09-02 19:43:20 +02:00
Willy Tarreau	bc1223be79	MINOR: http-rules: add a new "ignore-empty" option to redirects. Sometimes it is convenient to remap large sets of URIs to new ones (e.g. after a site migration for example). This can be achieved using "http-request redirect" combined with maps, but one difficulty there is that non-matching entries will return an empty response. In order to avoid this, duplicating the operation as an ACL condition ending in "-m found" is possible but it becomes complex and error-prone while it's known that an empty URL is not valid in a location header. This patch addresses this by improving the redirect rules to be able to simply ignore the rule and skip to the next one if the result of the evaluation of the "location" expression is empty. However in order not to break existing setups, it requires a new "ignore-empty" keyword. There used to be an ACT_FLAG_FINAL on redirect rules that's used during the parsing to emit a warning if followed by another rule, so here we only set it if the option is not there. The http_apply_redirect_rule() function now returns a 3rd value to mention that it did nothing and that this was not an error, so that callers can just ignore the rule. The regular "redirect" rules were not modified however since this does not apply there. The map_redirect VTC was completed with such a test and updated to 2.5 and an example was added into the documentation.	2021-09-02 17:06:18 +02:00
Tim Duesterhus	abc6b31ab8	CLEANUP: Add missing include guard to signal.h Found using GitHub's CodeQL scan.	2021-09-01 21:39:19 +02:00
Willy Tarreau	87154e3010	BUG/MAJOR: queue: better protect a pendconn being picked from the proxy The locking in the dequeuing process was significantly improved by commit `49667c14b` ("MEDIUM: queue: take the proxy lock only during the px queue accesses") in that it tries hard to limit the time during which the proxy's queue lock is held to the strict minimum. Unfortunately it's not enough anymore, because we take up the task and manipulate a few pendconn elements after releasing the proxy's lock (while we're under the server's lock) but the task will not necessarily hold the server lock since it may not have successfully found one (e.g. timeout in the backend queue). As such, stream_free() calling pendconn_free() may release the pendconn immediately after the proxy's lock is released while the other thread currently proceeding with the dequeuing tries to wake up the owner's task and dies in task_wakeup(). One solution consists in releasing the proxy's lock later. But tests have shown that we'd have to sacrifice a significant share of the performance gained with the patch above (roughly a 20% loss). This patch takes another approach. It adds a "del_lock" to each pendconn struct, that allows to keep it referenced while the proxy's lock is being released. It's mostly a serialization lock like a refcount, just to maintain the pendconn alive till the task_wakeup() call is complete. This way we can continue to release the proxy's lock early while keeping this one. It had to be added to the few points where we're about to free a pendconn, namely in pendconn_dequeue() and pendconn_unlink(). This way we continue to release the proxy's lock very early and there is no performance degradation. This lock may only be held under the queue's lock to prevent lock inversion. No backport is needed since the patch above was merged in 2.5-dev only.	2021-08-31 18:37:13 +02:00
Remi Tricot-Le Breton	fe21fe76bd	MINOR: log: Add new "error-log-format" option This option can be used to define a specific log format that will be used in case of error, timeout, connection failure on a frontend... It will be used for any log line concerned by the log-separate-errors option. It will also replace the format of specific error messages described in section 8.2.6. If no "error-log-format" is defined, the legacy error messages are still emitted and the other error logs keep using the regular log-format.	2021-08-31 12:13:08 +02:00
Willy Tarreau	ea57a9b103	BUILD: ssl: next round of build warnings on LIBRESSL_VERSION_NUMBER Other build warnings were emitted on LIBRESSL_VERSION_NUMBER with -Wundef under openssl < 1.1. Related to GH issue #1369. Seems like some of them could be simplified a little bit.	2021-08-30 06:20:46 +02:00
Willy Tarreau	a01f8ce2d4	BUILD/MINOR: regex: avoid a build warning on USE_PCRE2 with -Wundef regex-t emits a warning on #elif USE_PCRE2 when built with -Wundef, let's just fix it. This was reported in GH issue #1369.	2021-08-28 12:49:58 +02:00
Willy Tarreau	6e5542e9f4	BUILD/MINOR: ssl: avoid a build warning on LIBRESSL_VERSION with -Wundef Openssl-compat emits a warning for the test on LIBRESSL_VERSION that might be undefined, if built with -Wundef. The fix is easy, let's do it. Related to GH issue #1369.	2021-08-28 12:06:51 +02:00
Willy Tarreau	33056436c7	BUILD/MINOR: defaults: eliminate warning on MAXHOSTNAMELEN with -Wundef As reported in GH issue #1369, there is a single case of #if with a possibly undefined value in defaults.h which is on MAXHOSTNAMELEN. Let's turn it to a #ifdef.	2021-08-28 12:05:32 +02:00
Willy Tarreau	cbdc74b4b3	BUG/MINOR: ebtree: remove dependency on incorrect macro for bits per long The code used to rely on BITS_PER_LONG to decide on the most efficient way to perform a 64-bit shift, but this macro is not defined (at best it's __BITS_PER_LONG) and it's likely that it's been like this since the early implementation of ebtrees designed on i386. Let's remove the test on this macro and rely on sizeof(long) instead, it also has the benefit of letting the compiler validate the two branches. This can be backported to all versions. Thanks to Ezequiel Garcia for reporting this one in issue #1369.	2021-08-28 11:55:53 +02:00
Willy Tarreau	fe456c581f	MINOR: time: add report_idle() to report process-wide idle time Before threads were introduced in 1.8, idle_pct used to be a global variable indicating the overall process idle time. Threads made it thread-local, meaning that its reporting in the stats made little sense, though this was not easy to spot. In 2.0, the idle_pct variable moved to the struct thread_info via commit `81036f273` ("MINOR: time: move the cpu, mono, and idle time to thread_info"). It made it more obvious that the idle_pct was per thread, and also allowed to more accurately measure it. But no more effort was made in that direction. This patch introduces a new report_idle() function that accurately averages the per-thread idle time over all running threads (i.e. it should remain valid even if some threads are paused or stopped), and makes use of it in the stats / "show info" reports. Sending traffic over only two connections of an 8-thread process would previously show this erratic CPU usage pattern: $ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done Idle_pct: 30 Idle_pct: 35 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Idle_pct: 35 Idle_pct: 33 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Idle_pct: 100 Now it shows this more accurate measurement: $ while :; do socat /tmp/sock1 - <<< "show info"|grep ^Idle;sleep 0.1;done Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 Idle_pct: 83 This is not technically a bug but this lack of precision definitely affects some users who rely on the idle_pct measurement. This should at least be backported to 2.4, and might be to some older releases depending on user demand.	2021-08-28 11:18:10 +02:00
Willy Tarreau	e365aa28d4	BUG/MINOR: time: fix idle time computation for long sleeps In 2.4 we extended the max poll time from 1s to 60s with commit `4f59d3861` ("MINOR: time: increase the minimum wakeup interval to 60s"). This had the consequence that the calculation of the idle time percentage may overflow during the multiply by 100 if the thread had slept 43s or more. Let's change this to a 64 bit computation. This will have no performance impact since this is done at most twice per second. This should fix github issue #1366. This must be backported to 2.4.	2021-08-27 23:36:20 +02:00
Marcin Deranek	310a260e4a	MEDIUM: config: Deprecate tune.ssl.capture-cipherlist-size Deprecate tune.ssl.capture-cipherlist-size in favor of tune.ssl.capture-buffer-size which better describes the purpose of the setting.	2021-08-26 19:52:04 +02:00
Marcin Deranek	959a48c116	MINOR: sample: Expose SSL captures using new fetchers To be able to provide JA3 compatible TLS Fingerprints we need to expose all Client Hello captured data using fetchers. Patch provides new and modifies existing fetchers to add ability to filter out GREASE values: - ssl_fc_cipherlist_* - ssl_fc_ecformats_bin - ssl_fc_eclist_bin - ssl_fc_extlist_bin - ssl_fc_protocol_hello_id	2021-08-26 19:48:34 +02:00
Marcin Deranek	769fd2e447	MEDIUM: ssl: Capture more info from Client Hello When we set tune.ssl.capture-cipherlist-size to a non-zero value we are able to capture cipherlist supported by the client. To be able to provide JA3 compatible TLS fingerprinting we need to capture more information from Client Hello message: - SSL Version - SSL Extensions - Elliptic Curves - Elliptic Curve Point Formats This patch allows HAProxy to capture such information and store it for later use.	2021-08-26 19:48:33 +02:00
Willy Tarreau	906f7daed1	MINOR: compiler: implement an ONLY_ONCE() macro There are regularly places, especially in config analysis, where we need to report certain things (warnings or errors) only once, but where implementing a counter is sufficiently deterrent so that it's not done. Let's add a simple ONLY_ONCE() macro that implements a static variable (char) which is atomically turned on, and returns true if it's set for the first time. This uses fairly compact code, a single byte of BSS and is thread-safe. There are probably a number of places in the config parser where this could be used. It may also be used to implement a WARN_ON() similar to BUG_ON() but which would only warn once.	2021-08-26 16:35:00 +02:00
Amaury Denoyelle	5cca48cba2	MINOR: server: define non purgeable server flag Define a flag to mark a server as non purgeable. This flag will be used for "delete server" CLI handler. All servers without this flag will be eligible to runtime suppression.	2021-08-25 15:53:54 +02:00
Amaury Denoyelle	bc2ebfa5a4	MEDIUM: server: extend refcount for all servers In a future patch, it will be possible to remove every server at runtime, both static and dynamic. This requires extending the server refcount to all instances. First, refcount manipulation functions have been renamed to better express the API usage. * srv_refcount_use -> srv_take The refcount is always initialized to 1 on the server creation in new_server. It's also incremented for each check/agent configured on a server instance. * free_server -> srv_drop This decrements the refcount and if null, the server is freed, so code calling it must not use the server reference after it. As a bonus, this function now returns the next server instance. This is useful when calling on the server loop without having to save the next pointer before each invocation. In these functions, remove the checks that prevent refcount on non-dynamic servers. Each reference to "dynamic" in variable/function naming has been eliminated as well.	2021-08-25 15:53:54 +02:00
Amaury Denoyelle	0a8d05d31c	BUG/MINOR: stats: use refcount to protect dynamic server on dump A dynamic server may be deleted at runtime at the same moment when the stats applet is pointing to it. Use the server refcount to prevent deletion in this case. This should be backported up to 2.4, with an observability period of 2 weeks. Note that it requires the dynamic server refcounting feature which has been implemented on 2.5; the following commits are required : - MINOR: server: implement a refcount for dynamic servers - BUG/MINOR: server: do not use refcount in free_server in stopping mode - MINOR: server: return the next srv instance on free_server	2021-08-25 15:53:43 +02:00
Amaury Denoyelle	f5c1e12e44	MINOR: server: return the next srv instance on free_server As a convenience, return the next server instance from servers list on free_server. This is particularly useful when using this function on the servers list without having to save the next pointer before calling it.	2021-08-25 15:29:19 +02:00
Ilya Shipitsin	ff0f278860	CLEANUP: assorted typo fixes in the code and comments This is 26th iteration of typo fixes	2021-08-25 05:13:31 +02:00
William Lallemand	3aeb3f9347	MINOR: cfgcond: implements openssl_version_atleast and openssl_version_before Implements a way of checking the running openssl version: If the OpenSSL support was not compiled within HAProxy it will return an error, so it's recommended to do an SSL feature check before: $ ./haproxy -cc 'feature(OPENSSL) && openssl_version_atleast(0.9.8zh) && openssl_version_before(3.0.0)' This will allow to select the SSL reg-tests more carefully.	2021-08-22 00:30:24 +02:00
William Lallemand	44d862d8d4	MINOR: ssl: add an openssl version string parser openssl_version_parser() parses a string in the OpenSSL version format which is documented here: https://www.openssl.org/docs/man1.1.1/man3/OPENSSL_VERSION_NUMBER.html The function returns an unsigned int that could be used for comparing openssl versions.	2021-08-21 23:44:02 +02:00
William Lallemand	2a8fe8bb48	MINOR: httpclient: cleanup the include files Include the correct .h files in http_client.c and http_client.h. The api.h is needed in http_client.c and http_client-t.h is now include directly from http_client.h	2021-08-20 14:25:15 +02:00
Remi Tricot-Le Breton	f95c29546c	BUILD/MINOR: ssl: Fix compilation with OpenSSL 1.0.2 The X509_STORE_CTX_get0_cert did not exist yet on OpenSSL 1.0.2 and neither did X509_STORE_CTX_get0_chain, which was not actually needed since its get1 equivalent already existed.	2021-08-20 10:05:58 +02:00
Remi Tricot-Le Breton	74f6ab6e87	MEDIUM: ssl: Keep a reference to the client's certificate for use in logs Most of the SSL sample fetches related to the client certificate were based on the SSL_get_peer_certificate function which returns NULL when the verification process failed. This made it impossible to use those fetches in a log format since they would always be empty. The patch adds a reference to the X509 object representing the client certificate in the SSL structure and makes use of this reference in the fetches. The reference can only be obtained in ssl_sock_bind_verifycbk which means that in case of an SSL error occurring before the verification process ("no shared cipher" for instance, which happens while processing the Client Hello), we won't ever start the verification process and it will be impossible to get information about the client certificate. This patch also allows most of the ssl_c_XXX fetches to return a usable value in case of connection failure (because of a verification error for instance) by making the "conn->flags & CO_FL_WAIT_XPRT" test (which requires a connection to be established) less strict. Thanks to this patch, a log-format such as the following should return usable information in case of an error occurring during the verification process : log-format "DN=%{+Q}[ssl_c_s_dn] serial=%[ssl_c_serial,hex] \ hash=%[ssl_c_sha1,hex]" It should answer to GitHub issue #693.	2021-08-19 23:26:05 +02:00
William Lallemand	33b0d095cc	MINOR: httpclient: implement a simple HTTP Client API This commit implements a very simple HTTP Client API. A client can be operated by several functions: - httpclient_new(), httpclient_destroy(): create and destroy the struct httpclient instance. - httpclient_req_gen(): generate a complete HTX request using the absolute URL, the method and a list of headers. This request is complete and sets the HTX End of Message flag. This is limited to small requests that don't need a body. - httpclient_start(): fill a sockaddr storage with an IP extracted from the URL (it cannot resolve an FQDN for now), start the applet. It also stores the ptr of the caller which could be an appctx or something else. - hc->ops contains a list of callbacks used by the HTTPClient, they should be filled manually after an httpclient_new(): * res_stline(): the client received a start line, its content will be stored in hc->res.vsn, hc->res.status, hc->res.reason * res_headers(): the client received headers, they are stored in hc->res.hdrs. * res_payload(): the client received some payload data, they are stored in the hc->res.buf buffer and could be extracted with the httpclient_res_xfer() function, which takes a destination buffer as a parameter * res_end(): this callback is called once we finished to receive the response.	2021-08-18 17:36:32 +02:00
Willy Tarreau	d3d8d03d98	MINOR: http: add a new function http_validate_scheme() to validate a scheme While http_parse_scheme() extracts a scheme from a URI by extracting exactly the valid characters and stopping on delimiters, this new function performs the same on a fixed-size string.	2021-08-17 10:16:22 +02:00
Ilya Shipitsin	01881087fc	CLEANUP: assorted typo fixes in the code and comments This is 25th iteration of typo fixes	2021-08-16 12:37:59 +02:00
Christopher Faulet	df97ac4584	MEDIUM: filters/lua: Add HTTPMessage class to help HTTP filtering This new class exposes methods to manipulate HTTP messages from a filter written in lua. Like for the HTTP class, there is a bunch of methods to manipulate the message headers. But there are also methods to manipulate the message payload. This part is similar to what is available in the Channel class. Thus the payload can be duplicated, erased, modified or forwarded. For now, only DATA blocks can be retrieved and modified because the current API is limited. No HTTPMessage method is able to yield. Those manipulating the headers are always called on messages containing all the headers, so there is no reason to yield. Those manipulating the payload are called from the http_payload filters callback function where yielding is forbidden. When an HTTPMessage object is instantiated, the underlying Channel object can be retrieved via the ".channel" field. For now this class is not used because the HTTP filtering is not supported yet. It will be the purpose of another commit. There is no documentation for now.	2021-08-12 08:57:07 +02:00
Christopher Faulet	8c9e6bba0f	MINOR: lua: Add flags on the lua TXN to know the execution context A lua TXN can be created when a sample fetch, an action or a filter callback function is executed. A flag is now used to track the execute context. Respectively, HLUA_TXN_SMP_CTX, HLUA_TXN_ACT_CTX and HLUA_TXN_FLT_CTX. The filter flag is not used for now.	2021-08-12 08:57:07 +02:00
Christopher Faulet	1f43a3430e	MINOR: lua: Add a flag on lua context to know the yield capability at run time When a script is executed, a flag is used to allow it to yield. An error is returned if a lua function yields, explicitly or not. But there is no way to get this capability in C functions. So there is no way to choose to yield or not depending on this capability. To fill this gap, the flag HLUA_NOYIELD is introduced and added on the lua context if the current script execution is not authorized to yield. Macros to set, clear and test this flag are also added. This feature will be useful to fix some bugs in lua actions execution.	2021-08-12 08:57:07 +02:00
William Lallemand	8c29fa7454	MINOR: channel: remove an htx block from a channel co_htx_remove_blk() implements a way to remove an htx block from a channel buffer and update the channel output.	2021-08-12 00:51:59 +02:00
Amaury Denoyelle	7afa5c1843	MINOR: global: define MODE_STOPPING Define a new mode MODE_STOPPING. It is used to indicate that the process is in the stopping stage and no event loop runs anymore.	2021-08-09 17:51:55 +02:00
Amaury Denoyelle	b33a0abc0b	MEDIUM: check: implement check deletion for dynamic servers Implement a mechanism to free a started check on runtime for dynamic servers. A new function check_purge is created for this. The check task will be marked for deletion and scheduled to properly close connection elements and free the task/tasklet/buf_wait elements. This function will be useful to delete a dynamic server with checks.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	d6b7080cec	MINOR: server: implement a refcount for dynamic servers It is necessary to have a refcount mechanism on dynamic servers to be able to enable check support. Indeed, when deleting a dynamic server with check activated, the check will be asynchronously removed. This is mandatory to properly free the check resources in a thread-safe manner. The server instance must be kept alive for this.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	3c2ab1a0d4	MINOR: check: export check init functions Remove static qualifier on init_srv_check, init_srv_agent_check and start_check_task. These functions will be called in server.c for dynamic servers with checks.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	7b368339af	MEDIUM: task: implement tasklet kill Implement an equivalent of task_kill for tasklets. This function can be used to request a tasklet deletion in a thread-safe way. Currently this function is unused.	2021-08-06 11:07:48 +02:00
Christopher Faulet	434b8525ee	MINOR: spoe: Add a pointer on the filter config in the spoe_agent structure There was no way to access the SPOE filter configuration from the agent object. However it could be handy to have it. And in fact, this will be required to fix a bug.	2021-08-05 10:07:43 +02:00
Willy Tarreau	7b2ac29a92	CLEANUP: fd: remove the now unneeded fd_mig_lock This is not needed anymore since we don't use it when setting the running mask anymore.	2021-08-04 16:03:36 +02:00
Willy Tarreau	b201b1dab1	CLEANUP: fd: remove the now unused fd_set_running() It was inlined inside fd_update_events() since it relies on a loop that may return immediate failure codes.	2021-08-04 16:03:36 +02:00
Willy Tarreau	f69fea64e0	MAJOR: fd: get rid of the DWCAS when setting the running_mask Right now we're using a DWCAS to atomically set the running_mask while being constrained by the thread_mask. This DWCAS is annoying because we may seriously need it later when adding support for thread groups, for checking that the running_mask applies to the correct group. It turns out that the DWCAS is not strictly necessary because we never need it to set the thread_mask based on the running_mask, only the other way around. And in fact, the running_mask is always cleared alone, and the thread_mask is changed alone as well. The running_mask is only relevant to indicate a takeover when the thread_mask matches it. Any bit set in running and not present in thread_mask indicates a transition in progress. As such, it is possible to re-arrange this by using a regular CAS around a consistency check between running_mask and thread_mask in fd_update_events and by making a CAS on running_mask then an atomic store on the thread_mask in fd_takeover(). The only other case is fd_delete() but that one already sets the running_mask before clearing the thread_mask, which is compatible with the consistency check above. This change has happily survived 10 billion takeovers on a 16-thread machine at 800k requests/s. The fd-migration doc was updated to reflect this change.	2021-08-04 16:03:36 +02:00
Willy Tarreau	b1f29bc625	MINOR: activity/fd: remove the dead_fd counter This one is set whenever an FD is reported by a poller with a null owner, regardless of the thread_mask. It has become totally meaningless because it only indicates a migrated FD that was not yet reassigned to a thread, but as soon as a thread uses it, the status will change to skip_fd. Thus there is no reason to distinguish between the two, it adds more confusion than it helps. Let's simply drop it.	2021-08-04 16:03:36 +02:00
Willy Tarreau	88d1c5d3fb	MEDIUM: threads: add a stronger thread_isolate_full() call The current principle of running under isolation was made to access sensitive data while being certain that no other thread was using them in parallel, without necessarily having to place locks everywhere. The main use case are "show sess" and "show fd" which run over long chains of pointers. The thread_isolate() call relies on the "harmless" bit that indicates for a given thread that it's not currently doing such sensitive things, which is advertised using thread_harmless_now() and which ends usings thread_harmless_end(), which also waits for possibly concurrent threads to complete their work if they took this opportunity for starting something tricky. As some system calls were notoriously slow (e.g. mmap()), a bunch of thread_harmless_now() / thread_harmless_end() were placed around them to let waiting threads do their work while such other threads were not able to modify memory contents. But this is not sufficient for performing memory modifications. One such example is the server deletion code. By modifying memory, it not only requires that other threads are not playing with it, but are not either in the process of touching it. The fact that a pool_alloc() or pool_free() on some structure may call thread_harmless_now() and let another thread start to release the same object's memory is not acceptable. This patch introduces the concept of "idle threads". Threads entering the polling loop are idle, as well as those that are waiting for all others to become idle via the new function thread_isolate_full(). Once thread_isolate_full() is granted, the thread is not idle anymore, and it is released using thread_release() just like regular isolation. Its users have to keep in mind that across this call nothing is granted as another thread might have performed shared memory modifications. But such users are extremely rare and are actually expecting this from their peers as well. 
Note that in case of backport, this patch depends on previous patch: MINOR: threads: make thread_release() not wait for other ones to complete	2021-08-04 14:49:36 +02:00
William Lallemand	8e765b86fd	MINOR: proxy: disabled takes a stopping and a disabled state This patch splits the disabled state of a proxy into a PR_DISABLED and a PR_STOPPED state. The first one is set when the proxy is disabled in the configuration file, and the second one is set upon a stop_proxy().	2021-08-03 14:17:45 +02:00
William Lallemand	56f1f75715	MINOR: log: rename 'dontloglegacyconnerr' to 'log-error-via-logformat' Rename the 'dontloglegacyconnerr' option to 'log-error-via-logformat' which is much more self-explanatory and readable. Note: only legacy keywords don't use hyphens, it is recommended to separate words with them in new keywords.	2021-08-02 10:42:42 +02:00
Willy Tarreau	99198546f6	MEDIUM: atomic: relax the load/store barriers on x86_64 The x86-tso model makes the load and store barriers unneeded for our usage as long as they perform at least a compiler barrier: the CPU will respect store ordering and store vs load ordering. It's thus safe to remove the lfence and sfence which are normally needed only to communicate with external devices. Let's keep the mfence though, to make sure that reads of same memory location after writes report the value from memory and not the one snooped from the write buffer for too long. An in-depth review of all use cases tends to indicate that this is okay in the rest of the code. Some parts could be cleaned up to use atomic stores and atomic loads instead of explicit barriers though. Doing this reliably increases the overall performance by about 2-2.5% on a 8c-16t Xeon thanks to less frequent flushes (it's likely that the biggest gain is in the MT lists which use them a lot, and that this results in less cache line flushes).	2021-08-01 17:34:06 +02:00
Willy Tarreau	cb0451146f	MEDIUM: atomic: simplify the atomic load/store/exchange operations The atomic_load/atomic_store/atomic_xchg operations were all forced to __ATOMIC_SEQ_CST, which results in explicit store or even full barriers even on x86-tso while we do not need them: we're not communicating with external devices for example and are only interested in respecting the proper ordering of loads and stores between each other. These ones being rarely used, the emitted code on x86 remains almost the same (barring a handful of locations). However they will allow to place correct barriers at other places where atomics are accessed a bit lightly. The patch is marked medium because we can never rule out the risk of some bugs on more relaxed platforms due to the rest of the code.	2021-08-01 17:34:06 +02:00
Willy Tarreau	55a0975b1e	BUG/MINOR: freq_ctr: use stricter barriers between updates and readings update_freq_ctr_period() was using relaxed atomics without using barriers, which usually works fine on x86 but not everywhere else. In addition, some values were read without being enclosed by barriers, allowing the compiler to possibly prefetch them a bit earlier. Finally, freq_ctr_total() was also reading these without enough barriers. Let's make explicit use of atomic loads and atomic stores to get rid of this situation. This required to slightly rearrange the freq_ctr_total() loop, which could possibly slightly improve performance under extreme contention by avoiding to reread all fields. A backport may be done to 2.4 if a problem is encountered, but last tests on arm64 with LSE didn't show any issue so this can possibly stay as-is.	2021-08-01 17:34:06 +02:00
Willy Tarreau	200bd50b73	MEDIUM: fd: rely more on fd_update_events() to detect changes This function already performs a number of checks prior to calling the IOCB, and detects the change of thread (FD migration). Half of the controls are still in each poller, and these pollers also maintain activity counters for various cases. Note that the unreliable test on thread_mask was removed so that only the one performed by fd_set_running() is now used, since this one is reliable. Let's centralize all that fd-specific logic into the function and make it return a status among: FD_UPDT_DONE, // update done, nothing else to be done FD_UPDT_DEAD, // FD was already dead, ignore it FD_UPDT_CLOSED, // FD was closed FD_UPDT_MIGRATED, // FD was migrated, ignore it now Some pollers already used to call it last and have nothing to do after it, regardless of the result. epoll has to delete the FD in case a migration is detected. Overall this removes more code than it adds.	2021-07-30 17:45:18 +02:00
Willy Tarreau	84c7922c52	REORG: fd: uninline fd_update_events() This function has become a monster (80 lines and 2/3 of a kB), it doesn't benefit from being static nor inline anymore, let's move it to fd.c.	2021-07-30 17:41:55 +02:00
Willy Tarreau	a199a17d72	MINOR: fd: update flags only once in fd_update_events() Since 2.4 with commit `f50906519` ("MEDIUM: fd: merge fdtab[].ev and state for FD_EV_* and FD_POLL_* into state") we can merge all flag updates at once in fd_update_events(). Previously this was performed in 1 to 3 steps, setting the polling state, then setting READY_R if in/err/hup, and setting READY_W if out/err. But since the commit above, all flags are stored together in the same structure field that is being updated with the new flags, thus we can simply update the flags altogether and avoid multiple atomic operations. This even removes the need for atomic ops for FDs that are not shared.	2021-07-30 17:41:55 +02:00
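The single-step flag update described in a199a17d72 can be illustrated with a small C11 sketch. This is not HAProxy's code: the flag names and values below are hypothetical stand-ins for the real FD_EV_* and FD_POLL_* bits, and `update_events()` is an invented helper. It only shows the idea of computing the complete new state first (poll bits, then READY_R on in/err/hup and READY_W on out/err) and publishing it with one atomic operation instead of up to three.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flag bits; HAProxy's real FD_EV_* / FD_POLL_* values differ. */
#define EV_READY_R 0x01u
#define EV_READY_W 0x02u
#define POLL_IN    0x10u
#define POLL_OUT   0x20u
#define POLL_ERR   0x40u
#define POLL_HUP   0x80u
#define POLL_MASK  (POLL_IN | POLL_OUT | POLL_ERR | POLL_HUP)

/* Replace the poll bits of *state with <polled> and set the ready bits they
 * imply, all in a single compare-and-swap. Returns the state that was stored.
 */
unsigned int update_events(_Atomic unsigned int *state, unsigned int polled)
{
    unsigned int old, new_state;

    assert(state != NULL);
    polled &= POLL_MASK;
    old = atomic_load(state);
    do {
        /* build the whole new value before touching shared memory */
        new_state = (old & ~POLL_MASK) | polled;
        if (polled & (POLL_IN | POLL_ERR | POLL_HUP))
            new_state |= EV_READY_R;
        if (polled & (POLL_OUT | POLL_ERR))
            new_state |= EV_READY_W;
        /* on failure, <old> is refreshed and the value is recomputed */
    } while (!atomic_compare_exchange_weak(state, &old, new_state));

    return new_state;
}
```

For an FD that is not shared between threads, the same computation could be followed by a plain store, which is the saving the commit message mentions.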
Willy Tarreau	d5402b8df8	BUG/MINOR: fd: protect fd state harder against a concurrent takeover There's a theoretical race (that we failed to trigger) in function fd_update_events(), which could strike on idle connections. The "locked" variable will most often be 0 as the FD is bound to the current thread only. Another thread could take it over once "locked" is set, change the thread and running masks. Then the first thread updates the FD's state non-atomically and possibly overwrites what the other thread was preparing. It still looks like the FD's state will ultimately converge though. The solution against this is to set the running flag earlier so that a takeover() attempt cannot succeed, or that the fd_set_running() attempt fails, indicating that nothing needs to be done on this FD. While this is sufficient for a simple fix to be backported, it leaves the FD actively polled in the calling thread, this will trigger a second wakeup which will notice the absence of tid_bit in the thread_mask, getting rid of it. A more elaborate solution would consist in calling fd_set_running() directly from the pollers before calling fd_update_events(), getting rid of the thread_mask test and letting the caller eliminate that FD from its list if needed. Interestingly, this code also proves to be suboptimal in that it sets the FD state twice instead of calculating the new state at once and always using a CAS to set it. This is a leftover of a simplification that went into 2.4 and which should be explored in a future patch. This may be backported as far as 2.2.	2021-07-30 14:54:19 +02:00
Willy Tarreau	6ed242ece6	BUG/MEDIUM: connection: close a rare race between idle conn close and takeover The takeover of idle conns between threads is particularly tricky, for two reasons: - there's no way to atomically synchronize kernel-side polling with userspace activity, so late events will always be reported for some FDs just migrated ; - upon error, an FD may be immediately reassigned to whatever other thread since it's process-wide. The current model uses the FD's thread_mask to figure if an FD still ought to be reported or not, and a per-thread idle connection queue from which eligible connections are atomically added/picked. I/Os coming from the bottom for such a connection must remove it from the list so that it's not elected. Same for timeout tasks and iocbs. And these last ones check their context under the idle_conn lock to judge if they're still allowed to run. One rare case was omitted: the wake() callback. This one is rare, it may serve to notify about finalized connect() calls that are not being polled, as well as unhandled shutdowns and errors. This callback was not protected till now because it wasn't seen as sensitive, but there exists a particular case where it may be called without protection in parallel to a takeover. This happens in the following sequence: - thread T1 wants to establish an outgoing connection - the connect() call returns EINPROGRESS - the poller adds it using epoll_ctl() - epoll_wait() reports it, connect() is done. The connection is not being marked as actively polled anymore but is still known from the poller.
- the request is sent over that connection using send(), which queues to system buffers while data are being delivered - the scheduler switches to other tasks - the request is physically sent - the server responds - the stream is notified that send() succeeded, and makes progress, trying to recv() from that connection - the recv() succeeds, the response is delivered - the poller doesn't need to be touched (still no active polling) - the scheduler switches to other tasks - the server closes the connection - the poller on T1 is notified of the SHUTR and starts to call mux->wake() - another thread T2 takes over the connection - T2 continues to run inside wake() and releases the connection - T2 is just dereferencing it. - BAM. The most logical solution here is to surround the call to wake() with an atomic removal/insert of the connection from/into the idle conns lists. This way, wake() is guaranteed to run alone. Any other poller reporting the FD will not have its tid_bit in the thread_mask so it will not bother it. Another thread trying a takeover will not find this connection. A task or tasklet being woken up late will either be on the same thread, or be called on another one with a NULL context since it will be the consequence of previous successful takeover, and will be ignored. Note that the extra cost of a lock and tree access here have a low overhead which is totally amortized given that these ones roughly happen 1-2 times per connection at best. While it was possible to crash the process after 10-100k req using H2 and a hand-refined configuration achieving perfect synchronism between a long (20+) chain of proxies and a short timeout (1ms), now with that fix this never happens even after 10M requests. Many thanks to Olivier for proposing this solution and explaining why it works. This should be backported as far as 2.2 (when inter-thread takeover was introduced). The code in older versions will be found in conn_fd_handler().
A workaround consists in disabling inter-thread pool sharing using: tune.idle-pool.shared off	2021-07-30 08:34:38 +02:00
Remi Tricot-Le Breton	4a6328f066	MEDIUM: connection: Add option to disable legacy error log In case of connection failure, a dedicated error message is output, following the format described in section "Error log format" of the documentation. These messages cannot be configured through a log-format option. This patch adds a new option, "dontloglegacyconnerr", that disables those error logs when set, and "replaces" them by a regular log line that follows the configured log-format (thanks to a call to sess_log in session_kill_embryonic). The new fc_conn_err sample fetch allows to add the legacy error log information into a regular log format. This new option is unset by default so the logging logic will remain the same until this new option is used.	2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton	98b930d043	MINOR: ssl: Define a default https log format This patch adds a new httpslog option and a new HTTP over SSL log-format that expands the default HTTP format and adds SSL specific information.	2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton	7c6898ee49	MINOR: ssl: Add new ssl_fc_hsk_err sample fetch This new sample fetch along the ssl_fc_hsk_err_str fetch contain the last SSL error of the error stack that occurred during the SSL handshake (from the frontend's perspective). The errors happening during the client's certificate verification will still be given by the ssl_c_err and ssl_c_ca_err fetches. This new fetch will only hold errors retrieved by the OpenSSL ERR_get_error function.	2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton	3d2093af9b	MINOR: connection: Add a connection error code sample fetch The fc_conn_err and fc_conn_err_str sample fetches give information about the problem that made the connection fail. This information would previously only have been given by the error log messages meaning that thanks to these fetches, the error log can now be included in a custom log format. The log strings were all found in the conn_err_code_str function.	2021-07-29 15:40:45 +02:00
Remi Tricot-Le Breton	0aa4130d65	BUG/MINOR: connection: Add missing error labels to conn_err_code_str The CO_ER_SSL_EARLY_FAILED and CO_ER_CIP_TIMEOUT connection error codes were missing in the conn_err_code_str switch which converts the error codes into string. This patch can be backported on all stable branches.	2021-07-29 15:40:45 +02:00
William Lallemand	6bb77b9c64	MINOR: proxy: rename PR_CAP_LUA to PR_CAP_INT This patch renames the proxy capability "LUA" to "INT" so it could be used for any internal proxy. Every proxy that are not user defined should use this flag.	2021-07-28 15:51:42 +02:00
David CARLIER	534197c721	BUILD/MINOR: memprof fix macOs build. this platform has a similar malloc_usable_size too.	2021-07-21 10:22:48 +02:00
Willy Tarreau	dc70c18ddc	BUG/MEDIUM: cfgcond: limit recursion level in the condition expression parser Oss-fuzz reports in issue 36328 that we can recurse too far by passing extremely deep expressions to the ".if" parser. I thought we were still limited to the 1024 chars per line, that would be highly sufficient, but we don't have any limit now :-/ Let's just pass a maximum recursion counter to the recursive parsers. It's decremented for each call and the expression fails if it reaches zero. On the most complex paths it can add 3 levels per parenthesis, so with a limit of 1024, that's roughly 343 nested sub-expressions that are supported in the worst case. That's more than sufficient, for just a few kB of RAM. No backport is needed.	2021-07-20 18:03:08 +02:00
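The recursion limit described in dc70c18ddc (a counter decremented on each recursive call, with the parse failing once it reaches zero) can be sketched as follows. This is a deliberately tiny, hypothetical grammar where a term is either the letter `x` or a parenthesized term; the function names are invented for the illustration and are not those of cfgcond.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical grammar: term := 'x' | '(' term ')'
 * Returns a pointer just past the parsed term, or NULL on a syntax error
 * or when the recursion budget <maxdepth> is exhausted.
 */
static const char *parse_term(const char *s, int maxdepth)
{
    if (maxdepth <= 0)
        return NULL;                 /* expression nested too deeply */

    if (*s == '(') {
        /* each nested call gets a budget reduced by one */
        s = parse_term(s + 1, maxdepth - 1);
        if (!s || *s != ')')
            return NULL;
        return s + 1;
    }
    if (*s == 'x')
        return s + 1;
    return NULL;
}

/* Returns 1 if <s> is entirely one valid term within the depth budget. */
int parse_ok(const char *s, int maxdepth)
{
    const char *end;

    assert(s != NULL);
    end = parse_term(s, maxdepth);
    return end != NULL && *end == '\0';
}
```

With this sketch, `parse_ok("((x))", 3)` succeeds while `parse_ok("((x))", 2)` is rejected, so an attacker-controlled depth can no longer translate into unbounded stack usage.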
Willy Tarreau	252412316e	MEDIUM: proxy: remove long-broken 'option http_proxy' This option had always been broken in HTX, which means that the first breakage appeared in 1.9, that it was broken by default in 2.0 and that no workaround existed starting with 2.1. The way this option works is particularly unfit to the rest of the configuration and to the internal architecture. It had some uses when it was introduced 14 years ago but nowadays it's possible to do much better and more reliable using a set of "http-request set-dst" and "http-request set-uri" rules, which additionally are compatible with DNS resolution (via do-resolve) and are not exclusive to normal load balancing. The "option-http_proxy" example config file was updated to reflect this. The option is still parsed so that an error message gives hints about what to look for.	2021-07-18 19:35:32 +02:00
Willy Tarreau	f1db20c473	BUG/MINOR: cfgcond: revisit the condition freeing mechanism to avoid a leak The cfg_free_cond_{term,and,expr}() functions used to take a pointer to the pointer to be freed in order to replace it with a NULL once done. But this doesn't cope well with freeing lists as it would require recursion which the current code tried to avoid. Let's just change the API to free the area and let the caller set the NULL. This leak was reported by oss-fuzz (issue 36265).	2021-07-17 18:46:30 +02:00
Willy Tarreau	316ea7ede5	MINOR: cfgcond: support terms made of parenthesis around expressions Now it's possible to form a term using parenthesis around an expression. This will soon allow to build more complex expressions. For now they're still pretty limited but parenthesis do work.	2021-07-16 19:18:41 +02:00
Willy Tarreau	ca81887599	MINOR: cfgcond: insert an expression between the condition and the term Now evaluating a condition will rely on an expression (or an empty string), and this expression will support ORing a sub-expression with another optional expression. The sub-expression ANDs a term with another optional sub-expression. With this alone precedence between && and || is respected, and the following expression: A && B && C || D || E && F || G will naturally evaluate as: (A && B && C) || D || (E && F) || G	2021-07-16 19:18:41 +02:00
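The two-level grammar described in ca81887599 is enough to get the usual precedence without any explicit priority table. The sketch below is a minimal, hypothetical evaluator, not the cfgcond.c implementation: terms are reduced to the literals `0` and `1`, and all function names are invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical mini-grammar (terms are just the literals 0 and 1):
 *   expr    := subexpr [ "||" expr ]
 *   subexpr := term    [ "&&" subexpr ]
 */
static const char *skip_ws(const char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    return s;
}

/* Parse one term; returns 0 or 1, or -1 on a syntax error. */
static int eval_term(const char **s)
{
    *s = skip_ws(*s);
    if (**s != '0' && **s != '1')
        return -1;
    return *(*s)++ - '0';
}

/* A sub-expression ANDs a term with an optional sub-expression. */
static int eval_subexpr(const char **s)
{
    int left = eval_term(s);
    int right;

    if (left < 0)
        return -1;
    *s = skip_ws(*s);
    if ((*s)[0] != '&' || (*s)[1] != '&')
        return left;
    *s += 2;
    right = eval_subexpr(s);
    return right < 0 ? -1 : (left && right);
}

/* An expression ORs a sub-expression with an optional expression. */
static int eval_expr(const char **s)
{
    int left = eval_subexpr(s);
    int right;

    if (left < 0)
        return -1;
    *s = skip_ws(*s);
    if ((*s)[0] != '|' || (*s)[1] != '|')
        return left;
    *s += 2;
    right = eval_expr(s);
    return right < 0 ? -1 : (left || right);
}

/* Evaluate a whole string; returns -1 on error or trailing garbage. */
int eval_condition(const char *s)
{
    int ret;

    assert(s != NULL);
    ret = eval_expr(&s);
    if (ret < 0 || *skip_ws(s) != '\0')
        return -1;
    return ret;
}
```

Because `&&` is only consumed inside `eval_subexpr()`, an input such as `1 || 0 && 0` is grouped as `1 || (0 && 0)` and yields 1, whereas a flat left-to-right evaluation would yield 0.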
Willy Tarreau	087b2d018f	MINOR: cfgcond: make the conditional term parser automatically allocate nodes It's not convenient to let the caller be responsible for node allocation, better have the leaf function do that and implement the accompanying free call. Now only a pointer is needed instead of a struct, and the leaf function makes sure to leave the situation in a consistent way.	2021-07-16 19:18:41 +02:00
Willy Tarreau	ca56d3d28b	MINOR: cfgcond: support negating conditional expressions Now preceding a config condition term with "!" will simply negate it. Example: .if !feature(OPENSSL) .alert "SSL support is mandatory" .endif	2021-07-16 19:18:41 +02:00
Willy Tarreau	f869095df9	MINOR: cfgcond: start to split the condition parser to introduce terms The purpose is to build a descendent parser that will split conditions into expressions made of terms. There are two phases, a parsing phase and an evaluation phase. Strictly speaking it's not required to cut that in two right now, but it's likely that in the future we won't want certain predicates to be evaluated during the parsing (e.g. file system checks or execution of some external commands). The cfg_eval_condition() function is now much simpler, it just tries to parse a single term, and if OK evaluates it, then returns the result. Errors are unchanged and may still be reported during parsing or evaluation. It's worth noting that some invalid expressions such as streq(a,b)zzz continue to parse correctly for now (what remains after the parenthesis is simply ignored as not necessary).	2021-07-16 19:18:41 +02:00
Willy Tarreau	66243b4273	REORG: config: move the condition preprocessing code to its own file The .if/.else/.endif and condition evaluation code is quite dirty and was dumped into cfgparse.c because it was easy. But it should be tidied quite a bit as it will need to evolve. Let's move all that to cfgcond.{c,h}.	2021-07-16 19:18:41 +02:00
Willy Tarreau	ab213a5b6f	MINOR: arg: add a free_args() function to free an args array make_arg_list() can create an array of arguments, some of which remain to be resolved, but all users had to deal with their own roll back on error. Let's add a free_args() function to release all the array's elements and let the caller deal with the array itself (sometimes it's allocated in the stack).	2021-07-16 19:18:41 +02:00
Amaury Denoyelle	669b620e5f	MINOR: srv: extract tracking server config function Extract the post-config tracking setup in a dedicated function srv_apply_track. This will be useful to implement track support for dynamic servers.	2021-07-16 10:08:55 +02:00
Willy Tarreau	4c6986a6bc	CLEANUP: applet: remove unused thread_mask Since 1.9 with commit `673867c35` ("MAJOR: applets: Use tasks, instead of rolling our own scheduler.") the thread_mask field of the appctx became unused, but the code hadn't been cleaned for this. The appctx has its own task and the task's thread_mask is the one to be displayed. It's worth noting that all calls to appctx_new() pass tid_bit as the thread_mask. This makes sense, and it could be convenient to decide that this becomes the norm and to simplify the API.	2021-07-13 18:20:34 +02:00
Amaury Denoyelle	befeae88e8	MINOR: mux_h2: define config to disable h2 websocket support Define a new global config statement named "h2-workaround-bogus-websocket-clients". This statement will disable the automatic announce of h2 websocket support as specified in the RFC8441. This can be used to overcome clients which fail to implement the relatively fresh RFC8441. Clients will in this case automatically downgrade to http/1.1 for the websocket tunnel if the haproxy configuration allows it. This feature is relatively simple and can be backported up to 2.4, which saw the introduction of h2 websocket support.	2021-07-12 10:41:45 +02:00
Amaury Denoyelle	c453f9547e	MINOR: http: use http uri parser for path Replace http_get_path by the http_uri_parser API. The new function is renamed http_parse_path. Replace duplicated code for scheme and authority parsing by invocations to http_parse_scheme/authority. If no scheme is found for an URI detected as an absolute-uri/authority, consider it to be an authority format : no path will be found. For an absolute-uri or absolute-path, use the remaining of the string as the path. A new http_uri_parser state is declared to mark the path parsing as done.	2021-07-08 17:11:17 +02:00
Amaury Denoyelle	69294b20ac	MINOR: http: use http uri parser for authority Replace http_get_authority by the http_uri_parser API. The new function is renamed http_parse_authority. Replace duplicated scheme parsing code by http_parse_scheme invocation. A new http_uri_parser state is declared to mark the authority parsing as done.	2021-07-08 17:11:17 +02:00
Amaury Denoyelle	8ac8cbfd72	MINOR: http: use http uri parser for scheme Replace http_get_scheme by the http_uri_parser API. The new function is renamed http_parse_scheme. A new http_uri_parser state is declared to mark the scheme parsing as completed.	2021-07-08 17:11:17 +02:00
Amaury Denoyelle	89c68c8117	MINOR: http: implement http uri parser Implement a http uri parser type. This type will be used as a context to parse the various elements of an uri. The goal of this series of patches is to factorize duplicated code between the http_get_scheme/authority/path functions. A simple parsing API is designed to be able to extract once each element of an HTTP URI in order. The functions will be renamed in the following patches to reflect the API change with the prefix http_parse_*. For the parser API, the http_uri_parser type must first be initialized before usage. It will register the URI to parse and detect its format according to the rfc 7230.	2021-07-08 17:08:57 +02:00
Amaury Denoyelle	4c0882b1b4	MEDIUM: http: implement scheme-based normalization Implement the scheme-based uri normalization as described in rfc3986 6.3.2. Its purpose is to remove the port of an uri if the default one is used according to the uri scheme : 80/http and 443/https. All other ports are not touched. This method uses an htx message as an input. It requires that the target URI is in absolute-form with a http/https scheme. This represents most of h2 requests except CONNECT. On the contrary, most of h1 requests won't be eligible as origin-form is the standard case. The normalization is first applied on the target URL of the start line. Then, it is conducted on every Host headers present, assuming that they are equivalent to the target URL. This change will be notably useful to not confuse users who are accustomed to use the host for routing without specifying default ports. This problem was recently encountered with Firefox, which specifies the 443 default port for http2 websocket Extended CONNECT.	2021-07-07 15:34:01 +02:00
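The effect of the scheme-based normalization in 4c0882b1b4 can be illustrated on plain C strings. The actual patch works on an HTX message and also rewrites Host headers; the helper below, `normalize_default_port()`, is a hypothetical stand-alone sketch that only handles a lowercase `http://` or `https://` absolute-form URI and leaves everything else untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy <uri> into <out>, dropping ":80" for http and ":443" for https.
 * Any other URI (other port, other scheme, origin-form) is copied as-is.
 * Returns 0 on success, -1 if <out> (of size <outlen>) is too small.
 */
int normalize_default_port(const char *uri, char *out, size_t outlen)
{
    const char *auth = NULL, *defport = NULL;
    const char *end, *colon, *p;
    size_t len;

    assert(uri != NULL && out != NULL);
    len = strlen(uri);

    if (strncmp(uri, "http://", 7) == 0) {
        auth = uri + 7;
        defport = "80";
    } else if (strncmp(uri, "https://", 8) == 0) {
        auth = uri + 8;
        defport = "443";
    }

    if (defport) {
        /* the authority stops at the first '/', '?' or '#' */
        end = auth + strcspn(auth, "/?#");
        colon = NULL;
        for (p = auth; p < end; p++) {
            if (*p == ':')
                colon = p;
            else if (*p == ']' || *p == '@')
                colon = NULL; /* colon was in an IPv6 literal or userinfo */
        }
        if (colon && (size_t)(end - colon - 1) == strlen(defport) &&
            strncmp(colon + 1, defport, strlen(defport)) == 0) {
            size_t keep = (size_t)(colon - uri);

            if (len - (size_t)(end - colon) + 1 > outlen)
                return -1;
            memcpy(out, uri, keep);      /* scheme and host */
            strcpy(out + keep, end);     /* path, query, fragment */
            return 0;
        }
    }

    if (len + 1 > outlen)
        return -1;
    memcpy(out, uri, len + 1);
    return 0;
}
```

Under these assumptions, `https://example.com:443/a` becomes `https://example.com/a`, while `http://example.com:8080/` and an origin-form target such as `/index.html` are returned unchanged.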
Amaury Denoyelle	ef08811240	MINOR: http: implement http_get_scheme This method can be used to retrieve the scheme part of an uri, with the suffix '://'. It will be useful to implement scheme-based normalization.	2021-07-07 15:34:01 +02:00
Emeric Brun	4d7ada8f9e	MEDIUM: stick-table: add the new arrays of gpc and gpc_rate This patch adds the definition of two new array data_types: 'gpc': This is an array of 32bits General Purpose Counters. 'gpc_rate': This is an array on increment rates of General Purpose Counters. Like for all arrays, they are limited to 100 elements. This patch also adds actions and fetches to handle elements of those arrays. Note: As documented, those new actions and fetches won't apply to the legacy 'gpc0', 'gpc1', 'gpc0_rate' nor 'gpc1_rate'.	2021-07-06 07:24:42 +02:00
Emeric Brun	877b0b5a7b	MEDIUM: stick-table: add the new array of gpt data_type This patch adds the definition of a new array data_type 'gpt'. This is an array of 32bits General Purpose Tags. Like for all arrays, it is limited to 100 elements. This patch also adds actions and fetches to handle elements of this array. Note: As documented, those new actions and fetches won't apply to the legacy 'gpt0' data type.	2021-07-06 07:24:42 +02:00
Emeric Brun	90a9b676a8	MEDIUM: peers: handle arrays of std types in peers protocol This patch adds support of array data_types on the peer protocol. The table definition message will provide an additional parameter for array data-types: the number of elements of the array. In case of array of frqp it also provides a second parameter: the period used to compute freq counter. The array elements are std_type values linearly encoded in the update message. Note: if a remote peer announces an array data_type without parameters into the table definition message, all updates on this table will be ignored because we can not parse update messages consistently.	2021-07-06 07:24:42 +02:00
Emeric Brun	c64a2a307c	MEDIUM: stick-table: handle arrays of standard types into stick-tables This patch provides the code to handle arrays of some standard types (SINT, UINT, ULL and FRQP) in stick table. This way we could define new "array" data types. Note: the number of elements of an array was limited to 100 to put a limit and to ensure that an encoded update message will continue to fit into a buffer when the peer protocol will handle such data types.	2021-07-06 07:24:42 +02:00
Emeric Brun	0e3457b63a	MINOR: stick-table: make skttable_data_cast to use only std types This patch replaces all advanced data type aliases on stktable_data_cast calls by standard types. This way we could call the same stktable_data_cast regardless of the used advanced data type as long they are using the same std type. It also removes all the advanced data type aliases.	2021-07-06 07:24:42 +02:00
David Carlier	bae4cb2790	BUILD/MEDIUM: tcp: set-mark support for OpenBSD set-mark support for this platform, for routing table purpose. Follow-up from `f7f53afcf9`, this time for OpenBSD.	2021-07-05 10:53:18 +02:00
David Carlier	f7f53afcf9	BUILD/MEDIUM: tcp: set-mark setting support for FreeBSD. This platform has a similar socket option from Linux's SO_MARK, marking a socket with an id for packet filter purpose, DTrace monitoring and so on.	2021-06-28 07:03:35 +02:00
Christopher Faulet	469c06c30e	MINOR: http-act/tcp-act: Add "set-mark" and "set-tos" for tcp content rules It is now possible to set the Netfilter MARK and the TOS field value in all packets sent to the client from any tcp-request rulesets or the "tcp-response content" one. To do so, the parsing of "set-mark" and "set-tos" actions are moved in tcp_act.c and the actions evaluation is handled in dedicated functions. This patch may be backported as far as 2.2 if necessary.	2021-06-25 16:11:58 +02:00
Christopher Faulet	1da374af2f	MINOR: http-act/tcp-act: Add "set-nice" for tcp content rules It is now possible to set the "nice" factor of the current stream from a "tcp-request content" or "tcp-response content" ruleset. To do so, the action parsing is moved in stream.c and the action evaluation is handled in a dedicated function. This patch may be backported as far as 2.2 if necessary.	2021-06-25 16:11:53 +02:00
Christopher Faulet	551a641cff	MINOR: http-act/tcp-act: Add "set-log-level" for tcp content rules It is now possible to set the stream log level from a "tcp-request content" or "tcp-response content" ruleset. To do so, the action parsing is moved in stream.c and the action evaluation is handled in a dedicated function. This patch should fix issue #1306. It may be backported as far as 2.2 if necessary.	2021-06-25 16:11:46 +02:00
Willy Tarreau	51c63f0f0a	MINOR: queue: remove the px/srv fields from pendconn Now we directly use p->queue to get to the queue, which is much more straightforward. The performance on 100 servers and 16 threads increased from 560k to 574k RPS, or 2.5%. A lot more simplifications are possible, but the minimum was done at this point.	2021-06-24 10:52:31 +02:00
Willy Tarreau	8429097c61	MINOR: queue: store a pointer to the queue into the pendconn By following the queue pointer in the pendconn it will now be possible to always retrieve the elements (index, srv, px, etc).	2021-06-24 10:52:31 +02:00
Willy Tarreau	cdc83e0192	MINOR: queue: add a pointer to the server and the proxy in the queue A queue is specific to a server or a proxy, so we don't need to place this distinction inside all pendconns, it can be in the queue itself. This commit adds the relevant fields "px" and "sv" into the struct queue, and initializes them accordingly.	2021-06-24 10:52:31 +02:00
Willy Tarreau	df3b0cbe31	MINOR: queue: add queue_init() to initialize a queue This is better and cleaner than open-coding this in the server and proxy code, where it has all chances of becoming wrong once forgotten.	2021-06-24 10:52:31 +02:00
Willy Tarreau	9ab78293bf	MEDIUM: queue: simplify again the process_srv_queue() API (v2) This basically undoes the API changes that were performed by commit `0274286dd` ("BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check") to address the deadlock issue: since process_srv_queue() doesn't use the server lock anymore, it doesn't need the "server_locked" argument, so let's get rid of it before it gets used again.	2021-06-24 10:52:31 +02:00
Willy Tarreau	16fbdda3c3	MEDIUM: queue: use a dedicated lock for the queues (v2) Till now whenever a server or proxy's queue was touched, this server or proxy's lock was taken. Not only this requires distinct code paths, but it also causes unnecessary contention with other uses of these locks. This patch adds a lock inside the "queue" structure that will be used the same way by the server and the proxy queuing code. The server used to use a spinlock and the proxy an rwlock, though the queue only used it for locked writes. This new version uses a spinlock since we don't need the read lock part here. Tests have not shown any benefit nor cost in using this one versus the rwlock so we could change later if needed. The lower contention on the locks increases the performance from 362k to 374k req/s on 16 threads with 20 servers and leastconn. The gain with roundrobin even increases by 9%. This is tagged medium because the lock is changed, but no other part of the code touches the queues, with nor without locking, so this should remain invisible.	2021-06-24 10:52:31 +02:00
Willy Tarreau	3f70fb9ea2	Revert "MEDIUM: queue: use a dedicated lock for the queues" This reverts commit `fcb8bf8650`. The recent changes since `5304669e1` ("MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn") opened a tiny race condition between stream_free() and process_srv_queue(), as the pendconn is accessed outside of the lock, possibly while it's being freed. A different approach is required.	2021-06-24 07:26:28 +02:00
Willy Tarreau	ccd85a3e08	Revert "MEDIUM: queue: simplify again the process_srv_queue() API" This reverts commit `c83e45e9b0`. The recent changes since `5304669e1` ("MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn") opened a tiny race condition between stream_free() and process_srv_queue(), as the pendconn is accessed outside of the lock, possibly while it's being freed. A different approach is required.	2021-06-24 07:22:18 +02:00
Willy Tarreau	c83e45e9b0	MEDIUM: queue: simplify again the process_srv_queue() API This basically undoes the API changes that were performed by commit `0274286dd` ("BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check") to address the deadlock issue: since process_srv_queue() doesn't use the server lock anymore, it doesn't need the "server_locked" argument, so let's get rid of it before it gets used again.	2021-06-22 18:57:15 +02:00
Willy Tarreau	fcb8bf8650	MEDIUM: queue: use a dedicated lock for the queues Till now whenever a server or proxy's queue was touched, this server or proxy's lock was taken. Not only this requires distinct code paths, but it also causes unnecessary contention with other uses of these locks. This patch adds a lock inside the "queue" structure that will be used the same way by the server and the proxy queuing code. The server used to use a spinlock and the proxy an rwlock, though the queue only used it for locked writes. This new version uses a spinlock since we don't need the read lock part here. Tests have not shown any benefit nor cost in using this one versus the rwlock so we could change later if needed. The lower contention on the locks increases the performance from 491k to 507k req/s on 16 threads with 20 servers and leastconn. The gain with roundrobin even increases by 6%. The performance profile changes from this: 13.03% haproxy [.] fwlc_srv_reposition 8.08% haproxy [.] fwlc_get_next_server 3.62% haproxy [.] process_srv_queue 1.78% haproxy [.] pendconn_dequeue 1.74% haproxy [.] pendconn_add to this: 11.95% haproxy [.] fwlc_srv_reposition 7.57% haproxy [.] fwlc_get_next_server 3.51% haproxy [.] process_srv_queue 1.74% haproxy [.] pendconn_dequeue 1.70% haproxy [.] pendconn_add At this point the differences are mostly measurement noise. This is tagged medium because the lock is changed, but no other part of the code touches the queues, with nor without locking, so this should remain invisible.	2021-06-22 18:43:56 +02:00
Willy Tarreau	a05704582c	MINOR: server: replace the pendconns-related stuff with a struct queue Just like for proxies, all three elements (pendconns, nbpend, queue_idx) were moved to struct queue.	2021-06-22 18:43:14 +02:00
Willy Tarreau	7f3c1df248	MINOR: proxy: replace the pendconns-related stuff with a struct queue All three elements (pendconns, nbpend, queue_idx) were moved to struct queue.	2021-06-22 18:43:14 +02:00
Willy Tarreau	eea3817a47	MINOR: queue: create a new structure type "queue" This structure will be common to proxies and servers and will contain everything needed to handle their respective queues. For now it's only a tree head, a length and an index.	2021-06-22 18:43:14 +02:00
Willy Tarreau	5941ef0a6c	MINOR: lb/api: remove the locked argument from take_conn/drop_conn This essentially reverts commit 2b4370078 ("MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock") that was merged during 2.4 before the various locks could be eliminated at the lower layers. Passing that information complicates the cleanup of the queuing code and it's become useless.	2021-06-22 18:43:12 +02:00
Amaury Denoyelle	0274286dd3	BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check The server_parse_maxconn_change_request locks the server lock. However, this function can be called via agent-checks or lua code which already lock it. This bug has been introduced by the following commit : commit `79a88ba3d0` BUG/MAJOR: server: prevent deadlock when using 'set maxconn server' This commit tried to fix another deadlock which can occur because previously server_parse_maxconn_change_request required the server lock to be held. However, it may call internally process_srv_queue which also locks the server lock. The locking policy has thus been updated. The fix is functional for the CLI 'set maxconn' but fails to address the agent-check / lua counterparts. This new issue is fixed in two steps : - changes from the above commit have been reverted. This means that server_parse_maxconn_change_request must again be called with the server lock. - to counter the deadlock fixed by the above commit, process_srv_queue now takes an argument to render the server locking optional if the caller already held it. This is only used by server_parse_maxconn_change_request. The above commit was subject to backport up to 1.8. Thus this commit must be backported in every release where it is already present.	2021-06-22 11:39:20 +02:00
Tim Duesterhus	7386668cbf	CLEANUP: Prevent channel-t.h from being detected as C++ by GitHub GitHub uses github/linguist to determine the programming language used for each source file to show statistics and to power the search. In cases of unique file extensions this is easy, but for `.h` files the situation is less clear as they are used for C, C++, Objective C and more. In these cases linguist makes use of heuristics to determine the language. One of these heuristics for C++ is that the file contains a line beginning with `try`, only preceded by whitespace indentation. This heuristic matches the long comment at the bottom of `channel-t.h`, as one sentence includes the word `try` after a linebreak. Fix this misdetection by changing the comment to follow the convention that all lines start with an asterisk.	2021-06-20 11:46:26 +02:00
Amaury Denoyelle	36aa451a4e	MINOR: ssl: render file-access optional on server crt loading The function ssl_sock_load_srv_cert will be used at runtime for dynamic servers. If the cert is not loaded on ckch tree, we try to access it from the file-system. Now this access operation is rendered optional by a new function argument. It is only allowed at parsing time, but will be disabled for dynamic servers at runtime.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	c593bcdb43	MINOR: ssl: always initialize random generator Explicitly call ssl_initialize_random to initialize the random generator in init() global function. If the initialization fails, the startup is interrupted. This commit is in preparation for support of ssl on dynamic servers. To be able to activate ssl on dynamic servers, it is necessary to ensure that the random generator is initialized on startup regardless of the config. It cannot be called at runtime as access to /dev/urandom is required. This also has the effect of fixing the previous inconsistent behavior. Indeed, if bind or server in the config are using ssl, the initialization function was called, and if it failed, the startup was interrupted. Otherwise, the ssl initialization code could have been called through the ssl server for lua, but this time without blocking the startup on error. Or not called at all if lua was deactivated.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	2b1d91758d	BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose The SF_SRV_REUSED flag was set if a stream reused a backend connection. One of its purposes is to count the total reuses on the backend as opposed to newly instantiated connections. However, the flag was diverted from its original purpose since the following commit : `e8f5f5d8b2` BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready. With this change, the flag is not set anymore if the mux is not ready when a connection is picked for reuse. This can happen for multiplexed connections which are inserted in the available list as soon as created in http-reuse always mode. The goal of this change is to not retry immediately this request in case of an error on the same server if the reused connection is not fully ready. This change is justified for the retry timeout handling but it breaks other places which still use the flag for its original purpose. Mainly, in this case the wrong 'connect' backend counter is incremented instead of the 'reuse' one. The flag is also used in http_return_srv_error and may have an impact if a http server error is replied for this stream. To fix this problem, the original purpose of the flag is restored by setting it unconditionally when a connection is reused. Additionally, a new flag SF_SRV_REUSED_ANTICIPATED is created. This flag is set when the connection is reused but the mux is not ready yet. For the timeout handling on error, the request is retried immediately only if the stream reused a connection without this newly anticipated flag. This must be backported up to 2.1.	2021-06-17 17:58:50 +02:00
Christopher Faulet	dcac418062	BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status When a server relies on a SRV resolution, a task is created to clean it up (fqdn/port and address) when the SRV resolution is considered as outdated (based on the resolvers 'timeout' value). It is only possible if the server inherits outdated info from a state file and is no longer selected to be attached to a SRV item. Note that most of the time, a server is attached to a SRV item. Thus when the item becomes obsolete, the server is cleaned up. It is important to have such a task to be sure the server will be free again to have a chance to be resolved again with fresh information. Of course, this patch is a workaround to solve a design issue. But there is no other obvious way to fix it without rewriting all the resolvers part. And it must be backportable. This patch relies on following commits: * MINOR: resolvers: Clean server in a dedicated function when removing a SRV item * MINOR: resolvers: Remove server from named_servers tree when removing a SRV item All the series must be backported as far as 2.2 after some observation period. Backports to 2.0 and 1.8 must be evaluated.	2021-06-17 16:52:35 +02:00
Miroslav Zagorac	8a8f270f6a	CLEANUP: server: a separate function for initializing the per_thr field To avoid repeating the same source code, allocating memory and initializing the per_thr field from the server structure is transferred to a separate function.	2021-06-17 16:07:10 +02:00
Willy Tarreau	d943a044aa	MINOR: connection: add helper conn_append_debug_info() This function appends to a buffer some information from a connection. This will be used by traces and possibly some debugging as well. A frontend/backend/server, transport/control layers, source/destination ip:port, connection pointer and direction are reported depending on the available information.	2021-06-16 18:30:42 +02:00
Willy Tarreau	6fd0450b47	CLEANUP: shctx: remove the different inter-process locking techniques With a single process, we don't need to USE_PRIVATE_CACHE, USE_FUTEX nor USE_PTHREAD_PSHARED anymore. Let's only keep the basic spinlock to lock between threads.	2021-06-15 16:52:42 +02:00
Willy Tarreau	e8422bf56b	MEDIUM: global: remove the relative_pid from global and mworker The relative_pid is always 1. In mworker mode we also have a child->relative_pid which is always equal to relative_pid, except for a master (0) or external process (-1), but these types are usually tested for, except for one place that was amended to carefully check for the PROC_O_TYPE_WORKER option. Changes were pretty limited as most usages of relative_pid were for designating a process in stats output and peers protocol.	2021-06-15 16:52:42 +02:00
Willy Tarreau	06987f4238	CLEANUP: global: remove unused definition of MAX_PROCS This one was forced to 1 and the only reference was a test to verify it was comprised between 1 and LONGBITS.	2021-06-15 16:52:42 +02:00
Willy Tarreau	44ea631b77	MEDIUM: cpu-set: make the proc a single bit field and not an array We only have a single process now so we don't need to store the per-proc CPU binding anymore.	2021-06-15 16:52:42 +02:00
Willy Tarreau	72faef3866	MEDIUM: global: remove dead code from nbproc/bind_proc removal Lots of places iterating over nbproc or comparing with nbproc could be simplified. Further, "bind-process" and "process" parsing that was already limited to process 1 or "all" or "odd" resulted in a bind_proc field that was either 0 or 1 during the init phase and later always 1. All the checks for compatibilities were removed since it's not possible anymore to run a frontend and a backend on different processes or to have peers and stick-tables bound on different ones. This is the largest part of this patch. The bind_proc field was removed from both the proxy and the receiver structs. Since the "process" and "bind-process" directives are still parsed, configs making use of correct values allowing process 1 will continue to work.	2021-06-15 16:52:42 +02:00
Willy Tarreau	5301f5d72a	CLEANUP: global: remove pid_bit and all_proc_mask They were already set to 1 and never changed. Let's remove them and replace their references with 1.	2021-06-15 16:52:42 +02:00
Willy Tarreau	91358595f8	CLEANUP: global: remove the nbproc field from the global structure Let's use 1 in the rare places where it was still referenced since it's now its only possible value.	2021-06-15 16:52:42 +02:00
Willy Tarreau	9c6a80231f	CLEANUP: global: remove unused definition of stopping_task[] This is a leftover of a previous attempt that was introduced in 2.4 by commit `d3a88c1c3` ("MEDIUM: connection: close front idling connection on soft-stop"). It can be backported, as the variable doesn't exist.	2021-06-15 16:52:42 +02:00
Willy Tarreau	9e467af804	BUG/MEDIUM: shctx: use at least thread-based locking on USE_PRIVATE_CACHE Since threads were introduced in 1.8, the USE_PRIVATE_CACHE mode of the shctx was not updated to use locks. Originally it was meant to disable sharing between processes, so it removes the lock/unlock instructions. But with threads enabled, it's not possible to work like this anymore. It's easy to see that once built with private cache and threads enabled, sending violent SSL traffic to the process instantly makes it die. The HTTP cache is very likely affected as well. This patch addresses this by falling back to our native spinlocks when USE_PRIVATE_CACHE is used. In practice we could use them also for other modes and remove all older implementations, but this patch aims at keeping the changes very low and easy to backport. A new SHCTX_LOCK label was added to help with debugging, but OTHER_LOCK might be usable as well for backports. An even lighter approach for backports may consist in always declaring the lock (or reusing "waiters"), and calling pl_take_s() for the lock() and pl_drop_s() for the unlock() operation. This could even be used in all modes (process and threads), even when thread support is disabled. Subsequent patches will further clean up this area. This patch must be backported to all supported versions since 1.8.	2021-06-15 16:52:07 +02:00
Remi Tricot-Le Breton	6916493c29	MINOR: ssl: Use OpenSSL's ASN1_TIME convertor when available The ASN1_TIME_to_tm function was added in OpenSSL 1.1.1 so with this version of the library we do not need our homemade time converter anymore.	2021-06-14 15:12:53 +02:00
Willy Tarreau	b63dbb7b2e	MAJOR: config: remove parsing of the global "nbproc" directive This one was deprecated in 2.3 and marked for removal in 2.5. It suffers too many limitations compared to threads, and prevents some improvements from being engaged. Instead of a bypassable startup error, there is now a hard error. The parsing code was removed, and very few obvious cases were as well. The code is deeply rooted at certain places (e.g. "for" loops iterating from 0 to nbproc) so it will not be that trivial to remove everywhere. The "bind" and "bind-process" parsers will have to be adjusted, though maybe not completely changed if we later want to support thread groups for large NUMA machines. Some stats socket restrictions were removed, and the doc was updated according to what was done. A few places in the doc still refer to nbproc and will have to be revisited. The master-worker code also refers to the process number to distinguish between master and workers and will have to be carefully adjusted. The MAX_PROCS macro was reset to 1, this will at least reduce the size of some remaining arrays. Two regtests were depending on this directive, one with an explicit "nbproc 1" and another one testing the master's CLI using nbproc 4. Both were adapted.	2021-06-11 17:02:13 +02:00
Willy Tarreau	eb778248d9	MEDIUM: proxy: remove the deprecated "grace" keyword Commit `ab0a5192a` ("MEDIUM: config: mark "grace" as deprecated") marked the "grace" keyword as deprecated in 2.3, tentative removal for 2.4 with a hard deadline in 2.5, so let's remove it and return an error now. This old and outdated feature was incompatible with soft-stop, reload and socket transfers, and keeping it forced ugly hacks in the lower layers of the protocol stack.	2021-06-11 16:57:34 +02:00
Emeric Brun	3406766d57	MEDIUM: resolvers: add a ref between servers and srv request or used SRV record This patch adds a ref into servers to register them onto the record answer item used to set their hostnames. It also adds a list head into 'srvrq' to register servers that are free to be assigned to a SRV record. A tree head is also added to srvrq to hold servers which present a hostname in the server state file, so that they can be quickly re-linked to the matching record as soon as an item presents the same name. This results in better performance on SRV record response parsing. This is an optimization but it could avoid triggering haproxy's internal watchdog in some circumstances, and for this reason it should be backported as far as we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Emeric Brun	bd78c912fd	MEDIUM: resolvers: add a ref on server to the used A/AAAA answer item This patch adds a list head into answer items to register the servers which use this record to set their IPs. It makes lookups on duplicated IPs faster and allows checking immediately whether an item is still valid when renewing the IP. This results in better performance on A/AAAA resolutions. This is an optimization but it could avoid triggering haproxy's internal watchdog in some circumstances, and for this reason it should be backported as far as we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Christopher Faulet	1cf414b522	BUG/MAJOR: htx: Fix htx_defrag() when an HTX block is expanded When an HTX block is expanded, a defragmentation may be performed first to have enough space to copy the new data. When it happens, the meta data of the HTX message must take the new data length into account but copied data are still unchanged at this stage (because we need more space to update the message content). And here there is a bug because the meta data are updated by the caller. It means that when the block's content is copied, the new length is already set. Thus a block larger than the reality is copied and data outside the buffer may be accessed, leading to a crash. To fix this bug, htx_defrag() is updated to use an extra argument with the new meta data to use for the referenced block. Thus the caller does not need to update the HTX message by itself. However, it still has to update the data. Most of the time, the bug will be encountered in the HTTP compression filter. But, even if it is highly unlikely, in theory it is also possible to hit it when a HTTP header (or only its value) is replaced or when the start-line is changed. This patch must be backported as far as 2.0.	2021-06-11 14:05:34 +02:00
Willy Tarreau	c12bf9af0b	BUG/MEDIUM: errors: include missing obj_type file A tiny change in commit `6af81f80f` ("MEDIUM: errors: implement parsing context type") triggered an awful bug in gcc 5 and below (4.7.4 to 5.5 confirmed affected, at least on aarch64/mips/x86_64) causing the startup to loop forever in acl_find_target(). This was tracked down to the acl.c file seeing a different definition of the struct proxy than other files. The reason for this is that it sees an unpacked "enum obj_type" (4 bytes) while others see it packed (1 byte), thus all fields in the struct are having a different alignment, and the "acl" list is shifted one pointer to the next struct and seems to loop onto itself. The commit above did nothing more than adding "enum obj_type *obj" in a new struct without including obj_type.h, and that was apparently enough for the compiler to internally declare obj_type as a regular enum and silently ignore the packed attribute that it discovers later, so depending on the order of includes, some files would see it as 1 byte and others as 4. This patch simply adds the missing include but due to the nature of the bug, probably that creating a special "packed_enum" definition to disable the packed attribute on such compilers could be a safer option. No backport is needed as this is only in -dev.	2021-06-11 07:43:07 +02:00
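The size mismatch described in the commit above can be illustrated with a minimal sketch. It assumes GCC or Clang (the `packed` attribute is a compiler extension) and a typical ABI where a regular enum is as wide as an `int`. All names below are hypothetical stand-ins, not HAProxy's actual `obj_type` or `struct proxy` definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins: one packed enum, one regular enum. */
enum small_type { SMALL_A, SMALL_B } __attribute__((packed));
enum plain_type { PLAIN_A, PLAIN_B };

/* Identical field lists, except for the width of the leading enum.
 * Code compiled against one layout reads the wrong offsets when it is
 * handed a struct built with the other one. */
struct with_small { enum small_type t; char flag; void *list; };
struct with_plain { enum plain_type t; char flag; void *list; };

size_t small_enum_size(void)   { return sizeof(enum small_type); }
size_t plain_enum_size(void)   { return sizeof(enum plain_type); }
size_t flag_offset_small(void) { return offsetof(struct with_small, flag); }
size_t flag_offset_plain(void) { return offsetof(struct with_plain, flag); }
```

If two translation units disagree on which of the two enum definitions applies, they also disagree on where `flag` lives, which is the kind of silent layout shift the commit message describes.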
Remi Tricot-Le Breton	3faf0cbba6	BUILD: ssl: Fix compilation with BoringSSL The ifdefs surrounding the "show ssl ocsp-response" functionality that were supposed to disable the code with BoringSSL were built the wrong way. It does not need to be backported.	2021-06-10 19:01:13 +02:00
Willy Tarreau	8715dec6f9	MEDIUM: pools: remove the locked pools implementation Now that the modified lockless variant does not need a DWCAS anymore, there's no reason to keep the much slower locked version, so let's just get rid of it.	2021-06-10 17:46:50 +02:00
Willy Tarreau	1526ffe815	CLEANUP: pools: remove now unused seq and pool_free_list These ones were only used by the lockless implementation and are not needed anymore.	2021-06-10 17:46:50 +02:00
Willy Tarreau	2a4523f6f4	BUG/MAJOR: pools: fix possible race with free() in the lockless variant In GH issue #1275, Fabiano Nunes Parente provided a nicely detailed report showing reproducible crashes under musl. Musl is one of the libs coming with a simple allocator for which we prefer to keep the shared cache. On x86 we have a DWCAS so the lockless implementation is enabled for such libraries. And this implementation has had a small race since day one: the allocator will need to read the first object's <next> pointer to place it into the free list's head. If another thread picks the same element and immediately releases it, while both the local and the shared pools are too crowded, it will be freed to the OS. If the libc's allocator immediately releases it, the memory area is unmapped and we can have a crash while trying to read that pointer. However there is no problem as long as the item remains mapped in memory because whatever value found there will not be placed into the head since the counter will have changed. The probability for this to happen is extremely low, but as analyzed by Fabiano, it increases with the buffer size. On 16 threads it's relatively easy to reproduce with 2MB buffers above 200k req/s, where it should happen within the first 20 seconds of traffic usually. This is a structural issue for which there are two non-trivial solutions: - place a read lock in the alloc call and a barrier made of lock/unlock in the free() call to force to serialize operations; this will have a big performance impact since free() is already one of the contention points; - change the allocator to use a self-locked head, similar to what is done in the MT_LISTS. This requires two memory writes to the head instead of a single one, thus the overhead is exactly one memory write during alloc and one during free; This patch implements the second option. 
A new POOL_DUMMY pointer was defined for the locked pointer value, allowing to both read and lock it with a single xchg call. The code was carefully optimized so that the locked period remains the shortest possible and that bus writes are avoided as much as possible whenever the lock is held. Tests show that while a bit slower than the original lockless implementation on large buffers (2MB), it's 2.6 times faster than both the no-cache and the locked implementation on such large buffers, and remains as fast or faster than all the implementations when buffers are 48k or higher. Tests were also run on arm64 with similar results. Note that this code is not used on modern libcs featuring a fast allocator. A nice benefit of this change is that since it removes a dependency on the DWCAS, it will be possible to remove the locked implementation and replace it with this one, that is then usable on all systems, thus significantly increasing their performance with large buffers. Given that lockless pools were introduced in 1.9 (not supported anymore), this patch will have to be backported as far as 2.0. The code changed several times in this area and is subject to many ifdefs which will complicate the backport. What is important is to remove all the DWCAS code from the shared cache alloc/free lockless code and replace it with this one. The pool_flush() code is basically the same code as the allocator, retrieving the whole list at once. If in doubt regarding what barriers to use in older versions, it's safe to use the generic ones. This patch depends on the following previous commits: - MINOR: pools: do not maintain the lock during pool_flush() - MINOR: pools: call malloc_trim() under thread isolation - MEDIUM: pools: use a single pool_gc() function for locked and lockless The last one also removes one occurrence of an unneeded DWCAS in the code that was incompatible with this fix. The removal of the now unused seq field will happen in a future patch. 
Many thanks to Fabiano for his detailed report, and to Olivier for his help on this issue.	2021-06-10 17:46:50 +02:00
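The "self-locked head" idea from the commit above can be sketched with C11 atomics. This is not HAProxy's pool code: `struct item`, `HEAD_BUSY`, `free_list`, `list_pop()` and `list_push()` are hypothetical names, and the default sequentially consistent ordering is used for simplicity instead of tuned barriers. It only shows how a single exchange both reads and locks the head, at the cost of two writes to the head per operation.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item { struct item *next; };

/* Hypothetical sentinel meaning "head is locked"; never dereferenced. */
#define HEAD_BUSY ((struct item *)1)

static _Atomic(struct item *) free_list = NULL;

/* Read the head and lock it with one atomic exchange. If another thread
 * already holds it we get HEAD_BUSY back and simply retry. */
static struct item *lock_head(void)
{
    struct item *first;

    do {
        first = atomic_exchange(&free_list, HEAD_BUSY);
    } while (first == HEAD_BUSY);
    return first;
}

/* Pop one item, or return NULL when the list is empty. */
struct item *list_pop(void)
{
    struct item *first = lock_head();

    if (first == NULL) {
        atomic_store(&free_list, NULL);    /* unlock, nothing to take */
        return NULL;
    }
    /* Reading first->next is safe here: while the head is locked no
     * other thread can pick 'first' and free it under us. The store
     * below publishes the new head and unlocks at the same time. */
    atomic_store(&free_list, first->next);
    return first;
}

/* Push one item in front of the list. */
void list_push(struct item *it)
{
    it->next = lock_head();
    atomic_store(&free_list, it);          /* publish and unlock */
}
```

Compared with a counter-based double-word CAS, the first element's `next` pointer is only ever read while the head is held, which removes the window in which that memory could have been returned to the OS.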
Willy Tarreau	9a7aa3b4a1	BUG/MINOR: pools: make DEBUG_UAF always write to the to-be-freed location Since the code was reorganized, DEBUG_UAF was still tested in the locked pool code despite pools being disabled when DEBUG_UAF is used. Let's move the test to pool_put_to_os() which is the one that is always called in this condition. The impact is only a possible misleading analysis during a troubleshooting session due to a missing double-frees or free of const area test that is normally already dealt with by the underlying code anyway. In practice it's unlikely anyone will ever notice. This should only be backported to 2.4.	2021-06-10 17:46:50 +02:00
Remi Tricot-Le Breton	d92fd11c77	MINOR: ssl: Add new "show ssl ocsp-response" CLI command This patch adds the "show ssl ocsp-response [<id>]" CLI command. This command can be used to display the IDs of the OCSP tree entries along with details about the entries' certificate ID (issuer's name and key hash + serial number), or to display the details of a single ocsp-response if an ID is given. The details displayed in this latter case are the ones shown by a "openssl ocsp -respin <ocsp-response> -text" call.	2021-06-10 16:44:11 +02:00
William Lallemand	722180aca8	BUILD: make tune.ssl.keylog available again Since commit `04a5a44` ("BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions") the "tune.ssl.keylog" feature is broken because HAVE_OPENSSL_KEYLOG does not exist. Replace this by a HAVE_SSL_KEYLOG which is defined in openssl-compat.h. Also add an error when not built with the right openssl version. Must be backported as far as 2.3.	2021-06-09 17:10:13 +02:00
Amaury Denoyelle	846830e47d	BUG: errors: remove printf positional args for user messages context Change the algorithm for the generation of the user messages context prefix. Remove the dubious API relying on optional printf positional arguments. This may be non portable, and in fact the CI glibc crashes with the following error when some arguments are not present in the format string : "invalid %N$ use detected". Now, a fixed buffer attached to the context instance is allocated once for the program lifetime. Then call repeatedly snprintf with the optional arguments of context if present to build the context string. The buffer is deallocated via a per-thread free handler. This does not need to be backported.	2021-06-08 11:40:44 +02:00
Maximilian Mader	fc0cceb08a	MINOR: haproxy: Add `-cc` argument This patch adds the `-cc` (check condition) argument to evaluate conditions on startup and return the result as the exit code. As an example this can be used to easily check HAProxy's version in scripts: haproxy -cc 'version_atleast(2.4)' This resolves GitHub issue #1246. Co-authored-by: Tim Duesterhus <tim@bastelstu.be>	2021-06-08 11:17:19 +02:00
Maximilian Mader	29c6cd7d8a	CLEANUP: tools: Make errptr const in `parse_line()` This change is for consistency with `cfg_eval_condition()`.	2021-06-08 10:56:10 +02:00
Amaury Denoyelle	6af81f80fb	MEDIUM: errors: implement parsing context type Create a parsing_ctx structure. This type is used to store information about the current file/line parsed. A global context is created and can be manipulated when haproxy is in STARTING mode. When starting is over, the context is reset and should not be accessed anymore.	2021-06-07 16:58:16 +02:00
Amaury Denoyelle	1833e43c3e	MEDIUM: errors: implement user messages buffer The user messages buffer is used to store the stderr output after the starting is over. Each thread has its own user messages buffer. Add some functions to add a new message, retrieve and clear the content. The primary goal of the user messages buffer is to be consulted by CLI handlers. Each handler using it must clear the buffer before starting its operation.	2021-06-07 16:58:16 +02:00
Amaury Denoyelle	ce986e1ce8	REORG: errors: split errors reporting function from log.c Move functions related to errors output on stderr from log.c to a newly created errors.c file. It targets print_message and ha_alert/warning/notice/diag functions and related startup_logs feature.	2021-06-07 16:58:15 +02:00
Amaury Denoyelle	01b3c3d4fb	MINOR: errors: allow empty va_args for diag variadic macro Use the '##' operator to allow the usage of HA_DIAG_WARNING_COND macro without extra arguments. This must be backported up to 2.4.	2021-06-07 16:58:15 +02:00
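As an illustration of the `##` usage mentioned in the commit above, here is a minimal sketch. `DIAG_COND` and the two helper functions are hypothetical and unrelated to the real HA_DIAG_WARNING_COND definition. `##__VA_ARGS__` is a GNU extension supported by GCC and Clang: it drops the preceding comma when the variadic part is empty.

```c
#include <stdio.h>

/* Hypothetical macro: formats a message into 'buf' when 'cond' is true
 * and evaluates to the snprintf() result, or to 0 otherwise. Thanks to
 * '##', it can be called with or without arguments after 'fmt'. */
#define DIAG_COND(buf, size, cond, fmt, ...) \
    ((cond) ? snprintf((buf), (size), fmt, ##__VA_ARGS__) : 0)

/* No extra argument: without '##' this would expand to a trailing
 * comma inside the snprintf() call and fail to compile. */
int diag_demo_noargs(char *buf, size_t size)
{
    return DIAG_COND(buf, size, 1, "no extra argument");
}

/* With an extra argument the macro behaves like a regular variadic one. */
int diag_demo_args(char *buf, size_t size)
{
    return DIAG_COND(buf, size, 1, "value=%d", 42);
}
```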
Remi Tricot-Le Breton	476462010e	BUG/MINOR: proxy: Missing calloc return value check in chash_init_server_tree A memory allocation failure happening in chash_init_server_tree while trying to allocate a server's lb_nodes item used in consistent hashing would have resulted in a crash. This function is only called during configuration parsing. It was raised in GitHub issue #1233. It could be backported to all stable branches.	2021-05-31 10:55:51 +02:00
Remi Tricot-Le Breton	1f4fa906c7	BUG/MINOR: worker: Missing calloc return value check in mworker_env_to_proc_list A memory allocation failure happening in mworker_env_to_proc_list when trying to allocate a mworker_proc would have resulted in a crash. This function is only called during init. It was raised in GitHub issue #1233. It could be backported to all stable branches.	2021-05-31 10:51:06 +02:00
Remi Tricot-Le Breton	208ff01b23	BUG/MINOR: peers: Missing calloc return value check in peers_register_table A memory allocation failure happening during peers_register_table would have resulted in a crash. This function is only called during init. It was raised in GitHub issue #1233. It could be backported to all stable branches.	2021-05-31 10:50:46 +02:00
Remi Tricot-Le Breton	f1800e64ef	BUG/MINOR: server: Missing calloc return value check in srv_parse_source Two calloc calls were not checked in the srv_parse_source function. Considering that this function could be called at runtime through a dynamic server creation via the CLI, this could lead to an unfortunate crash. It was raised in GitHub issue #1233. It could be backported to all stable branches even though the runtime crash could only happen on branches where dynamic server creation is possible.	2021-05-31 10:50:32 +02:00
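The four fixes above share the same pattern: test the pointer returned by calloc() before using it and report the failure to the caller. A minimal sketch of that pattern follows; `struct node` and `alloc_nodes()` are hypothetical and do not reproduce any of the patched HAProxy functions.

```c
#include <stdlib.h>

struct node { int key; };    /* hypothetical element type */

/* Allocates 'count' zeroed nodes (count is expected to be non-zero).
 * Returns 0 on success and -1 on allocation failure, in which case
 * *out is NULL and must not be dereferenced by the caller. */
int alloc_nodes(struct node **out, size_t count)
{
    *out = calloc(count, sizeof(**out));
    if (!*out)
        return -1;           /* let the caller report the error */
    return 0;
}
```

A caller would check the return value, emit its own error message on -1, and release the array with free() once it is no longer needed.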
Christopher Faulet	4fc51a73e6	MINOR: buf: Add function to realign a buffer with a specific head position b_slow_realign() function may be used to realign a buffer with a given amount of output data, possibly 0. In such a case, the head is set to 0. This function is not designed to be used with input only buffers, like those used in the muxes. It is the purpose of b_slow_realign_ofs() function. It does almost the same, realign a buffer. But it does so by setting the buffer head to a specific offset.	2021-05-25 10:41:50 +02:00
Christopher Faulet	de471a4a8d	MINOR: h1-htx: Update h1 parsing functions to return result as a size_t h1 parsing functions (h1_parse_msg_*) return the number of bytes parsed or 0 if nothing is parsed because an error occurred or some data are missing. But they never return negative values. Thus, instead of a signed integer, these functions now return a size_t value. The H1 and FCGI muxes are updated accordingly. Note that h1_parse_msg_data() has been slightly adapted because the parsing of chunked messages still needs to handle negative values when a parsing error is reported by h1_parse_chunk_size() or h1_skip_chunk_crlf().	2021-05-25 10:41:50 +02:00
Dragan Dosen	3e6690a555	CLEANUP: pattern: remove export of non-existent function pattern_delete()	2021-05-25 08:44:48 +02:00
Dragan Dosen	a75eea78e2	MINOR: map/acl: print the count of all the map/acl entries in "show map/acl" The output of "show map/acl" now contains the 'entry_cnt' value that represents the count of all the entries for each map/acl, not just the active ones, which means that it also includes entries currently being added.	2021-05-25 08:44:45 +02:00
Remi Tricot-Le Breton	2608e348be	BUG/MEDIUM: ebtree: Invalid read when looking for dup entry The first item inserted into an ebtree will be inserted directly below the root, which is a simple struct eb_root which only holds two branch pointers (left and right). If we try to find a duplicated entry to this first leaf through a ebmb_next_dup, our leaf_p pointer will point to the eb_root instead of a complete eb_node so we cannot look for the bit part of our leaf_p since it would try to cast our eb_root into an eb_node and perform an out of bounds access when reading "eb_root_to_node(eb_untag(t,EB_LEFT)))->bit". This bug was found by address sanitizer running on a CRL hot update VTC test. Note that the bug has been there since the import of the eb_next_dup() and eb_prev_dup() function in 1.5-dev19 by commit `2b5702030` ("MINOR: ebtree: add new eb_next_dup/eb_prev_dup() functions to visit duplicates"). It can be backported to all stable branches.	2021-05-18 19:26:21 +02:00
Remi Tricot-Le Breton	18c7d83934	BUILD/MINOR: ssl: Fix compilation with OpenSSL 1.0.2 The following functions used in CA/CRL file hot update were not defined in OpenSSL 1.0.2 so they need to be defined in openssl-compat : - X509_CRL_get_signature_nid - X509_CRL_get0_lastUpdate - X509_CRL_get0_nextUpdate - X509_REVOKED_get0_serialNumber - X509_REVOKED_get0_revocationDate	2021-05-18 00:28:31 +02:00
Remi Tricot-Le Breton	a51b339d95	MEDIUM: ssl: Add "set+commit ssl crl-file" CLI commands This patch adds the "set ssl crl-file" and "commit ssl crl-file" commands, following the same logic as the certificate and CA file update equivalents. When trying to update a Certificate Revocation List (CRL) file via a "set" command, we start by looking for the entry in the CA file tree and then building a new cafile_entry out of the payload, without adding it to the tree yet. It will only be added when a "commit" command is called. During a "commit" command, we insert the newly built cafile_entry in the CA file tree while keeping the previous entry. We then iterate over all the instances that used the CRL file and rebuild a new one and its dedicated SSL context for every one of them. When all the contexts are properly created, the old instances get replaced by the new ones and the old CRL file is removed from the tree.	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	0bb482436c	MINOR: ssl: Add a cafile_entry type field The CA files and CRL files are stored in the same cafile_tree so this patch adds a new field to the cafile_entry structure that specifies the type of the entry. Since a ca-file can also have some CRL sections, the type will be based on the option used to load the file and not on its content (ca-file vs crl-file options).	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	a32a68bd3b	MEDIUM: ssl: Add "set+commit ssl ca-file" CLI commands This patch adds the "set ssl ca-file" and "commit ssl ca-file" commands, following the same logic as the certificate update equivalents. When trying to update a ca-file entry via a "set" command, we start by looking for the entry in the cafile_tree and then building a new cafile_entry out of the given payload. This new object is not added to the cafile_tree until "commit" is called. During a "commit" command, we insert the newly built cafile_entry in the cafile_tree, while keeping the previous entry as well. We then iterate over all the instances linked in the old cafile_entry and rebuild a new ckch instance for every one of them. The newly inserted cafile_entry is used for all those new instances and their respective SSL contexts. When all the contexts are properly created, the old instances get replaced by the new ones and the old cafile_entry is removed from the tree. This fixes a subpart of GitHub issue #1057.	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	38c999b11c	MINOR: ssl: Add helper function to add cafile entries Adds a way to insert a new uncommitted cafile_entry in the tree. This entry will be the one fetched by any lookup in the tree unless the oldest cafile_entry is explicitly looked for. This way, until a "commit ssl ca-file" command is completed, there could be two cafile_entries with the same path in the tree, the original one and the newly updated one.	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	383fb1472e	MEDIUM: ssl: Add a way to load a ca-file content from memory The updated CA content coming from the CLI during a ca-file update will directly be in memory and not on disk so the way CAs are loaded in a cafile_entry for now (via X509_STORE_load_locations calls) cannot be used. This patch adds a way to fill a cafile_entry directly from memory and to load the contained certificate and CRL sections into an SSL store. CRL sections are managed as well as certificates in order to mimic the way CA files are processed when specified in an option. Indeed, when parsing a CA file given through a ca-file or ca-verify-file option, we iterate over the different sections in ssl_set_cert_crl_file and load them regardless of their type. This ensures that a file that was properly parsed when given as an option will also be accepted by the CLI.	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	5daff3c8ab	MINOR: ssl: Add helper functions to create/delete cafile entries Add ssl_store_create_cafile_entry and ssl_store_delete_cafile_entry functions.	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	40ddea8222	MINOR: ssl: Add reference to default ckch instance in bind_conf In order for the link between the cafile_entry and the default ckch instance to be built, we need to give a pointer to the instance during the ssl_sock_prepare_ctx call.	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	4458b9732d	MEDIUM: ssl: Chain ckch instances in ca-file entries Each ca-file entry of the tree will now hold a list of the ckch instances that use it so that we can iterate over them when updating the ca-file via a cli command. Since the link between the SSL contexts and the CA file tree entries is only built during the ssl_sock_prepare_ctx function, which is called after all the ckch instances are created, we need to add a little post processing after each ssl_sock_prepare_ctx that builds the link between the corresponding ckch instance and CA file tree entries. In order to manage the ca-file and ca-verify-file options, any ckch instance can be linked to multiple CA file tree entries and any CA file entry can link multiple ckch instances. This is done thanks to a dedicated list of ckch_inst references stored in the CA file tree entries over which we can iterate (during an update for instance). We avoid having one of those instances go stale by keeping a list of references to those references in the instances. When deleting a ckch_inst, we can then remove all the ckch_inst_link instances that reference it, and when deleting a cafile_entry, we iterate over the list of ckch_inst references and clear the corresponding entry in their own list of ckch_inst_link references.	2021-05-17 10:50:24 +02:00
Remi Tricot-Le Breton	af8820a9a5	CLEANUP: ssl: Move ssl_store related code to ssl_ckch.c This patch moves all the ssl_store related code to ssl_ckch.c since it will mostly be used there once the CA file update CLI commands are all implemented. It also makes the cafile_entry structure visible as well as the cafile_tree.	2021-05-17 10:50:24 +02:00
Willy Tarreau	1f97306ecc	[RELEASE] Released version 2.5-dev0 Released version 2.5-dev0 with the following main changes : - MINOR: version: it's development again	2021-05-14 09:36:37 +02:00
Willy Tarreau	1cb9fe7a75	MINOR: version: it's development again this essentially reverts `46fb37c70c`.	2021-05-14 09:36:08 +02:00
Willy Tarreau	46fb37c70c	MINOR: version: mention that it's LTS now. The version will be maintained up to around Q2 2026. Let's also update the INSTALL file to mention this.	2021-05-14 09:02:22 +02:00
Willy Tarreau	388fc25915	IMPORT: slz: use inttypes.h instead of stdint.h stdint.h is not as portable as inttypes.h. It doesn't exist at least on AIX 5.1 and Solaris 7, while inttypes.h is present there and does include stdint.h on platforms supporting it. This is equivalent to libslz upstream commit e36710a ("slz: use inttypes.h instead of stdint.h")	2021-05-14 08:44:52 +02:00
Willy Tarreau	9e274280a4	IMPORT: slz: do not produce the crc32_fast table when CRC is natively supported On ARM with native CRC support, no need to inflate the executable with a 4kB CRC table, let's just drop it. This is slz upstream commit d8715db20b2968d1f3012a734021c0978758f911.	2021-05-12 09:29:33 +02:00
Tim Duesterhus	dec1c36b3a	MINOR: uri_normalizer: Add `fragment-encode` normalizer This normalizer encodes '#' as '%23'. See GitHub Issue #714.	2021-05-11 17:24:32 +02:00
Tim Duesterhus	c9e05ab2de	MINOR: uri_normalizer: Add `fragment-strip` normalizer This normalizer strips the URI's fragment component which should never be sent to the server. See GitHub Issue #714.	2021-05-11 17:23:46 +02:00
Willy Tarreau	da7f11bfb5	CLEANUP: pattern: remove the unused and dangerous pat_ref_reload() This function was not used anymore after the atomic updates were implemented in 2.3, and it must not be used given that it does not yield and can easily make the process hang for tens of seconds on large acls/maps. Let's remove it before someone uses it as an example to implement something else!	2021-05-11 16:49:55 +02:00
Willy Tarreau	9bc457f0ea	BUILD: compat: include malloc_np.h for USE_MEMORY_PROFILING on FreeBSD This include is needed for malloc_usable_size(). It's also important to think about disabling global pools.	2021-05-09 23:46:45 +02:00
Willy Tarreau	92fbbcc4c6	MINOR: cli: sort the output of the "help" keywords It's still very difficult to find all commands starting with a given keyword like "set", "show" etc. Let's sort the lines by usage message, this is much more convenient.	2021-05-09 22:39:07 +02:00
Willy Tarreau	6b86d9e485	BUILD: errors: include stdarg in errors.h It's needed for va_list as defined in ha_vdiag_warning().	2021-05-09 12:11:41 +02:00
Willy Tarreau	2a8a2f0223	BUILD: ssl: define HAVE_CRYPTO_memcmp() based on the library version The build fails on versions older than 1.0.1d which is the first one introducing CRYPTO_memcmp(), so let's have a define for this instead of enabling it whenever USE_OPENSSL is set. One could also wonder why we're relying on openssl for such a trivial thing, and a simple local implementation could also allow to restore lexicographic ordering.	2021-05-09 12:10:36 +02:00
Willy Tarreau	714f34580e	DOC: fix a few remaining cases of "Haproxy" and "HAproxy" in doc and comments Some of the Lua doc and a few places still used "Haproxy" or "HAproxy". There was even one "HA proxy". A few of them were in an example of VTest output, indicating that VTest ought to be fixed as well. No big deal but better address all the remaining ones so that these inconsistencies stop spreading around.	2021-05-09 06:50:46 +02:00
Willy Tarreau	a219ec5cb2	BUILD: config: do not include proxy.h nor errors.h anymore in cfgparse.h These ones induce a long dependency chain and are not needed anymore.	2021-05-08 20:35:39 +02:00
Willy Tarreau	32840b77a5	BUILD: connection: stop including listener-t.h listener-t comes with openssl just due to the SSL_CTX type that is declared as a typedef in openssl hence cannot be abstracted at this level. However connection-t.h doesn't need all that just to know that bind_conf is a struct. Let's declare it with other external types instead.	2021-05-08 20:27:08 +02:00
Willy Tarreau	08138612a4	REORG: config: uninline warnifnotcap() and failifnotcap() These ones are used by virtually every config parser. Not only do they provide no benefit in being inlined, but they imply a very deep dependency starting at proxy.h, which results for example in task.c including openssl. Let's move these two functions to cfgparse.c.	2021-05-08 20:27:08 +02:00
Willy Tarreau	6ec1f25bc5	REORG: stick-table: move composite address functions to stick_table.h These caddr_* functions were once placed into tools.h in the hope they would be useful but nobody knows they exist. They could deserve being moved to their own file with other pointer manipulation functions maybe, but for now they're the only reason left for stick_table.h to include tools.h, so let's move them directly there since it's its only user. This allows to remove tools.h from stick_table.h and slightly reduce the overall build time.	2021-05-08 20:24:09 +02:00
Willy Tarreau	3b63ca20f4	REORG: stick-table: uninline stktable_alloc_data_type() This function has no business being inlined in stick_table.h since it's only used at boot time by the config parser. In addition it causes an undesired dependency on tools.h because it uses parse_time_err(). Let's move it to stick_table.c.	2021-05-08 20:24:09 +02:00
Willy Tarreau	e59b5169b3	BUILD: connection: move list_mux_proto() to connection.c No idea why this was put inlined into connection.h, it's used only once for haproxy -vv, and requires tools.h, causing an undesired dependency from connection.h. Let's move it to connection.c instead where it ought to have been.	2021-05-08 20:24:09 +02:00
Willy Tarreau	5703a38a06	BUILD: stick-table: include freq_ctr.h from stick_table.h It's needed for update_freq_ctr_period() which is used there.	2021-05-08 19:37:41 +02:00
Willy Tarreau	15f9ac3c59	REORG: mworker: move proc_self from global to mworker Only mworker uses proc_self, and it was declared in global.h, forcing users of global.h to include mworker and its dependencies. Moving it to mworker reduces the preprocessed size of version.c from 170 to 125kB by shrinking the number of local includes from 30 to 16 and the number of system includes from 147 to 132.	2021-05-08 12:34:44 +02:00
Willy Tarreau	29c460bc07	REORG: threads: move all_thread_mask() to thread.h It was declared in global.h, forcing plenty of source files to include it only for this while it's only based on definitions from thread.h.	2021-05-08 12:26:10 +02:00
Willy Tarreau	cfc4f24d80	REORG: vars: move the "proc" scope variables out of the global struct The presence of this field causes a long dependency chain because almost everyone includes global-t.h, and vars include sample_data which include some system includes as well as HTTP parts. There is absolutely no reason for having the process-wide variables in the global struct, let's just move them into vars.c and vars.h. This reduces from ~190k to ~170k the preprocessed output of version.c.	2021-05-08 12:11:29 +02:00
Willy Tarreau	2745620240	MINOR: stats: support an optional "float" option to "show info" This will allow some fields to be produced with a higher accuracy when the requester indicates being able to parse floats. Rates and times are among the elements which can make sense.	2021-05-08 10:52:12 +02:00
Willy Tarreau	0b26b3866c	MINOR: stats: pass the appctx flags to stats_fill_info() Currently the stats filling function knows nothing about the caller's needs, so let's pass the STAT_* flags so that it can adapt to the requester's constraints.	2021-05-08 10:52:12 +02:00
Willy Tarreau	aa33f20e27	MINOR: freq_ctr: add new functions to report float measurements For stats reporting it can be convenient to report floats at low rates instead of discrete integers. We do have quite some precision since we currently divide counters by number of milliseconds, so we can usually add 3 digits after the decimal point.	2021-05-08 10:48:17 +02:00
Willy Tarreau	ae03d26eea	MINOR: tools: add a float-to-ascii conversion function We already had ultoa_r() and friends but nothing to emit inline floats. This is now done with ftoa_r() and F2A/F2H. Note that the latter both use the itoa_str[] as temporary storage and that the HTML format currently is the exact same as the ASCII one. The trailing zeroes are always trimmed so these outputs are usable in user-visible output.	2021-05-08 10:48:17 +02:00
Willy Tarreau	56d1d8dab0	MINOR: tools: implement trimming of floating point numbers When using "%f" to print a float, it automatically gets 6 digits after the decimal point and there's no way to automatically adjust to the required ones by dropping trailing zeroes. This function does exactly this and automatically drops the decimal point if all digits after it were zeroes. This will make numbers more friendly in stats and makes outputs shorter (e.g. JSON where everything is just a "number"). The function is designed to be easy to use with snprintf() and chunks: snprintf: flt_trim(buf, 0, snprintf(buf, sizeof(buf), "%f", x)); chunk_printf: out->data = flt_trim(out->area, 0, chunk_printf(out, "%f", x)); chunk_appendf: size_t prev_data = out->data; out->data = flt_trim(out->area, prev_data, chunk_appendf(out, "%f", x));	2021-05-08 10:42:11 +02:00
Amaury Denoyelle	b979f59871	MINOR: proxy: define PR_CAP_LB Add a new proxy capability for proxies with load-balancing capabilities. This helps to differentiate listen/frontend/backend from special proxies such as peer proxies.	2021-05-07 15:12:20 +02:00
Amaury Denoyelle	5dfdf3e5b0	MINOR: stats: report tainted on show info Add a new info field ST_F_TAINTED to dump tainted status at the end of the 'show info' output.	2021-05-07 14:35:02 +02:00
Amaury Denoyelle	f492992065	MINOR: cli: set tainted when using CLI expert/experimental mode Mark the process as tainted as soon as a command only accessible in expert or experimental mode is executed.	2021-05-07 14:35:02 +02:00
Amaury Denoyelle	0351773534	MINOR: action: implement experimental actions Support experimental actions. 'expose-experimental-directives' must be set first in order to use them. If such an action is present in the config file, the tainted status of the process is updated. Another tainted status is set when an experimental action is executed.	2021-05-07 14:35:02 +02:00
Amaury Denoyelle	e4a617c931	MINOR: action: replace match_pfx by a keyword flags field Define a new keyword flag KWF_MATCH_PREFIX. This is used to replace the match_pfx field of action struct. This has the benefit to have more explicit action declaration, and now it is possible to quickly implement experimental actions.	2021-05-07 14:35:01 +02:00
Amaury Denoyelle	d2e53cd47e	MINOR: cfgparse: implement experimental config keywords Add a new flag to mark a keyword as experimental. An experimental keyword cannot be used if the global 'expose-experimental-directives' is not present first. Only keywords parsed through a standard cfg_keywords lists in global/proxies section will be automatically detected if declared experimental. To support a keyword outside of these lists, check_kw_experimental must be called manually during its parsing. If an experimental keyword is present in the config, the tainted flag is updated. For the moment, no keyword is marked as experimental.	2021-05-07 14:34:41 +02:00
Amaury Denoyelle	fae9edf470	MINOR: cfgparse: add a new field flags in cfg_keyword This field will be used to add various mechanisms to config parsing. Currently no flag value is implemented. The following commit will implement experimental keywords.	2021-05-07 14:12:27 +02:00
Amaury Denoyelle	484454d906	MINOR: global: define tainted flag Add a global flag named 'tainted'. Its purpose is to report various statuses about experimental features used during the current process lifetime. By default it is initialized to 0. It can be set/retrieved by a couple of new functions mark_tainted()/get_tainted(). Once a flag is set, it cannot be reset. Currently, no tainted status is implemented, it will be the subject of the following commits.	2021-05-07 14:12:27 +02:00
Willy Tarreau	a43dfda4e1	MINOR: global: add version comparison functions The new function split_version() converts a parsable haproxy version to an array of integers. The function compare_current_version() compares an arbitrary version to the current one. These two functions were written by Thierry Fournier in 2013, and are still usable as-is. They will be used to write config language predicates.	2021-05-06 17:02:36 +02:00
Willy Tarreau	f0d3b732fb	MINOR: global: export the build features string list Till now it was only presented in the version output but could not be consulted outside of haproxy.c, let's export it as a variable, and set it to an empty string if not defined.	2021-05-06 17:02:36 +02:00
Willy Tarreau	5150805a5c	MINOR: config: keep up-to-date current file/line/section in the global struct Let's add a few fields to the global struct to store information about the current file being processed, the current line number and the current section. This will be used to retrieve them using special variables.	2021-05-06 10:35:03 +02:00
Christopher Faulet	d8219b31e7	MINOR: conn-stream: Force mux to wait for read events if abortonclose is set When the abortonclose option is enabled, to be sure to be immediately notified when a shutdown is received from the client, the frontend conn-stream must be sure the mux will wait for read events. To do so, the CO_RFL_KEEP_RECV flag is set when mux->rcv_buf() is called. This new flag instructs the mux to wait for read events, regardless of its internal state. This patch is required to fix abortonclose option for H1 client connections.	2021-05-06 09:19:05 +02:00
Christopher Faulet	1c235e57d0	MINOR: channel: Rely on HTX version if appropriate in channel_may_recv() When channel_may_recv() is called for an HTX stream, the HTX version, channel_htx_may_recv() is called. This patch is mandatory to fix a bug related to the abortonclose option.	2021-05-06 09:19:05 +02:00
Willy Tarreau	00dd44f67f	MINOR: activity: add a "memory" entry to "profiling" This adds the necessary flags to permit run-time enabling/disabling of memory profiling. For now this is disabled. A few words were added to the management doc about it and recalling that this is limited to certain OSes.	2021-05-05 18:55:02 +02:00
Willy Tarreau	64192392c4	MINOR: tools: add functions to retrieve the address of a symbol get_sym_curr_addr() will return the address of the first occurrence of the given symbol while get_sym_next_addr() will return the address of the next occurrence of the symbol. These ones return NULL on non-linux, non-ELF, non-USE_DL.	2021-05-05 16:24:52 +02:00
Amaury Denoyelle	d3a88c1c32	MEDIUM: connection: close front idling connection on soft-stop Implement a safe mechanism to close front idling connections which prevent the soft-stop from completing. Every h1/h2 front connection is added to a new per-thread list instance. On shutdown, a new task is woken up which calls the mux wake operation on every connection still present in the new list. A new stopping_list attach point has been added in the connection structure. As this member is only used for frontend connections, it shares the same union as the session_list reserved for backend connections.	2021-05-05 14:39:23 +02:00
Amaury Denoyelle	99cca08ecc	MINOR: connection: move session_list member in a union Move the session_list attach point in an anonymous union. This member is only used for backend connections. This commit is in preparation for the support of stopping frontend idling connections which will add another member to the union. This change means that special care must be taken to be sure that only backend connections manipulate the session_list. A few BUG_ON have been added as special guards to prevent misuse.	2021-05-05 14:35:36 +02:00
Willy Tarreau	1ab6c0bfd2	MINOR: pools/debug: slightly relax DEBUG_DONT_SHARE_POOLS The purpose of this debugging option was to prevent certain pools from masking other ones when they were shared. For example, task, http_txn, h2s, h1s, h1c, session, fcgi_strm, and connection are all 192 bytes and would normally be merged, but not with this option. The problem is that certain pools are declared multiple times with various parameters, which are often very close, and due to the way the option works, they're not shared either. Good examples of this are captures and stick tables. Some configurations have large numbers of stick-tables of pretty similar types and it's very common to end up with the following when the option is enabled: $ socat - /tmp/sock1 <<< "show pools" \| grep stick - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753800=56 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753880=57 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753900=58 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753980=59 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a00=60 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a80=61 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753b00=62 - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753780=55 In addition to not being convenient, it can have important effects on the memory usage because these pools will not share their entries, so one stick table cannot allocate from another one's pool. This patch solves this by going back to the initial goal which was not to have different pools in the same list.
Instead of masking the MAP_F_SHARED flag, it simply adds a test on the pool's name, and disables pool sharing if the names differ. This way pools are not shared unless they're of the same name and size, which doesn't hinder debugging. The same test above now returns this: $ socat - /tmp/sock1 <<< "show pools" \| grep stick - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 7 users, @0x3fadb30 [SHARED] - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x3facaa0 [SHARED] This is much better. This should probably be backported, in order to limit the side effects of DEBUG_DONT_SHARE_POOLS being enabled in production.	2021-05-05 07:47:29 +02:00
Amaury Denoyelle	d272b409d7	BUILD: compiler: do not use already defined __read_mostly on dragonfly DragonflyBSD already has an attribute __read_mostly which serves the same purpose as the one in compiler.h. No need to be backported as it was added in the current 2.4-dev.	2021-04-30 17:16:36 +02:00
Willy Tarreau	a13afe6535	MINOR: pattern: support purging arbitrary ranges of generations Instead of being able to purge only values older than a specific value, let's support arbitrary ranges and make pat_ref_purge_older() just be one special case of this one.	2021-04-30 15:36:31 +02:00
Willy Tarreau	b4476c6a8c	CLEANUP: freq_ctr: make arguments of freq_ctr_total() const freq_ctr_total() doesn't modify the freq counters, it should take a const argument.	2021-04-28 17:44:37 +02:00
Christopher Faulet	8b604d1656	CLEANUP: channel: No longer notify the producer in co_skip()/co_htx_skip() Thanks to the commit "BUG/MINOR: applet: Notify the other side if data were consumed by an applet", it is no longer necessary to notify the producer when an applet skips output data. Now, it is the default applet handler responsibility to take care of that.	2021-04-28 11:08:35 +02:00
Christopher Faulet	260ec8e9a9	MINOR: htx: Limit length of headers name/value when a HTX message is dumped In htx_dump() function, we now limit the length of the headers name and the value to not fully print huge headers.	2021-04-28 10:51:08 +02:00
Christopher Faulet	2b78f0bfc4	CLEANUP: htx: Remove unused hdrs_bytes field from the HTX start-line Thanks to the htx_xfer_blks() refactoring, it is now possible to remove hdrs_bytes field from the start-line because no function relies on it anymore.	2021-04-28 10:51:08 +02:00
Amaury Denoyelle	fc6ac53dca	BUG/MAJOR: fix build on musl with cpu_set_t support Move cpu_map structure outside of the global struct to a global variable defined in cpuset.c compilation unit. This allows to reorganize the includes without having to define _GNU_SOURCE everywhere for the support of the cpu_set_t. This fixes the compilation with musl libc, most notably used for the alpine based docker image. This fixes the github issue #1235. No need to backport as this feature is new in the current 2.4-dev.	2021-04-27 14:11:26 +02:00
Amaury Denoyelle	9463f0e222	BUG/MINOR: cpuset: move include guard at the very beginning The include guard in cpuset-t.h was misplaced and should be the first directive of the file. No need to backport.	2021-04-27 10:39:39 +02:00
Ilya Shipitsin	b2be9a1ea9	CLEANUP: assorted typo fixes in the code and comments This is 22nd iteration of typo fixes	2021-04-26 10:42:58 +02:00
Christopher Faulet	df3db630e4	REORG: htx: Inline htx functions to add HTX blocks in a message The HTX functions used to add new HTX blocks in a message have been moved to the header file to inline them in calling functions. These functions are small enough.	2021-04-26 10:24:57 +02:00
Christopher Faulet	fb38c910f8	BUG/MINOR: mux-fcgi: Don't send normalized uri to FCGI application A normalized URI is the internal term used to specify that a URI is stored using the absolute format (scheme + authority + path). For now, it is only used for H2 clients. It is the default and recommended format for H2 requests. However, it is unusual for H1 servers to receive such a URI. So in this case, we only send the path of the absolute URI. It is performed for H1 servers, but not for FCGI applications. This patch fixes the difference. Note that it is not a real bug, because FCGI applications should support absolute URIs. Note also that a normalized URI is only detected for H2 clients when a request is received. There is no such test on the H1 side. It means an absolute URI received from an H1 client will be sent without modification to an H1 server or a FCGI application. To make it possible, a dedicated function has been added to get the H1 URI. This function is called by the H1 and the FCGI multiplexer when a request is sent to a server. This patch should fix the issue #1232. It must be backported as far as 2.2.	2021-04-26 10:23:18 +02:00
Tim Duesterhus	2e4a18e04a	MINOR: uri_normalizer: Add a `percent-decode-unreserved` normalizer This normalizer decodes percent encoded characters within the RFC 3986 unreserved set. See GitHub Issue #714.	2021-04-23 19:43:45 +02:00
Emeric Brun	2cc201f97e	BUG/MEDIUM: peers: re-work refcnt on table to protect against flush In proxy.c, when the process is stopping we try to flush the tables' content using 'stktable_trash_oldest'. A check on a counter "table->syncing" was made to verify that there is no pending resync in progress. But when using multiple threads this counter can be increased by another thread only after some delay, so the content of some tables can be trashed earlier and won't be pushed to the new process (after reload, some tables appear reset and others don't). This patch renames the counter "table->syncing" to "table->refcnt" and the counter is increased during configuration parsing (registering a table to a peer section) to protect tables during runtime and until resync of a new process has succeeded or failed. The inc/dec operations are now made using atomic operations because multiple peer sections could refer to the same table in the future. This fix addresses github #1216. This patch should be backported on all branches with multi-thread support (v >= 1.8)	2021-04-23 18:03:06 +02:00
Willy Tarreau	5020ffbe49	MINOR: time: avoid u64 needlessly expensive computations for the 32-bit now_ms The compiler cannot guess that tv_sec or tv_usec might have unused parts, so the multiply by 1000 and the divide by 1000 are both performed using 64-bit constants to stick to the common type. This is not needed since we only keep the final 32 bits, let's help the compiler here by casting these fields to uint. The tv_update_date() code is much cleaner (48 bytes smaller in the CAS loop) as it avoids some register spilling at a location where that's really unwanted.	2021-04-23 18:03:06 +02:00
Amaury Denoyelle	a6f9c5d2a7	BUG/MINOR: cpuset: fix compilation on platform without cpu affinity The compilation is currently broken on platform without USE_CPU_AFFINITY set. An error has been reported by the cygwin build of the CI. This does not need to be backported. In file included from include/haproxy/global-t.h:27, from include/haproxy/global.h:26, from include/haproxy/fd.h:33, from src/ev_poll.c:22: include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform" 32 \| # error "No cpuset support implemented on this platform" \| ^~~~~ include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’ 37 \| CPUSET_REPR cpuset; \| ^~~~~~~~~~~ make: *** [Makefile:944: src/ev_poll.o] Error 1 make: *** Waiting for unfinished jobs.... In file included from include/haproxy/global-t.h:27, from include/haproxy/global.h:26, from include/haproxy/fd.h:33, from include/haproxy/connection.h:30, from include/haproxy/ssl_sock.h:27, from src/ssl_sample.c:30: include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform" 32 \| # error "No cpuset support implemented on this platform" \| ^~~~~ include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’ 37 \| CPUSET_REPR cpuset; \| ^~~~~~~~~~~ make: *** [Makefile:944: src/ssl_sample.o] Error 1	2021-04-23 17:04:24 +02:00
Amaury Denoyelle	0f50cb9c73	MINOR: global: add option to disable numa detection Render numa detection optional with a global configuration statement 'no numa-cpu-mapping'. This can be used if the applied affinity of the algorithm is not optimal. Also complete the documentation with this new keyword.	2021-04-23 16:06:49 +02:00
Amaury Denoyelle	b56a7c89a8	MEDIUM: cfgparse: detect numa and set affinity if needed On process startup, the CPU topology of the machine is inspected. If a multi-socket CPU machine is detected, automatically define the process affinity on the first node with active cpus. This is done to prevent an impact on the overall performance of the process in case the topology of the machine is unknown to the user. This step is not executed in the following condition : - a non-null nbthread statement is present - a restrictive 'cpu-map' statement is present - the process affinity is already restricted, for example via a taskset call For the record, benchmarks were executed on a machine with 2 CPUs Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. In both clear and ssl scenario, the performance were sub-optimal without the automatic rebinding on a single node.	2021-04-23 16:06:49 +02:00
Amaury Denoyelle	a80823543c	MINOR: cfgparse: support the comma separator on parse_cpu_set Allow to specify multiple cpu ids/ranges in parse_cpu_set separated by a comma. This is optional and must be activated by a parameter. The comma support is disabled for the parsing of the 'cpu-map' config statement. However, it will be useful to parse files in sysfs when inspecting the cpus topology for NUMA automatic process binding.	2021-04-23 16:06:49 +02:00
Amaury Denoyelle	4c9efdecf5	MINOR: thread: implement the detection of forced cpu affinity Create a function thread_cpu_mask_forced. Its purpose is to report if a restrictive cpu mask is active for the current process, for example due to a taskset invocation. It is only implemented for the linux platform currently.	2021-04-23 16:06:49 +02:00
Amaury Denoyelle	982fb53390	MEDIUM: config: use platform independent type hap_cpuset for cpu-map Use the platform independent type hap_cpuset for the cpu-map statement parsing. This allows addressing CPU indexes greater than LONGBITS. Update the documentation to reflect the removal of this limit except for platforms without cpu_set_t type or equivalent.	2021-04-23 16:06:49 +02:00
Amaury Denoyelle	c90932bc8e	MINOR: cfgparse: use hap_cpuset for parse_cpu_set Replace the unsigned long parameter by a hap_cpuset. This allows addressing CPUs with an index greater than LONGBITS. This function is used to parse the 'cpu-map' statement. However at the moment, the result is cast back to a long to store it in the global structure. The next step is to replace ulong in cpu_map in the global structure with hap_cpuset.	2021-04-23 16:06:49 +02:00
Amaury Denoyelle	f75c640f7b	MINOR: cpuset: define a platform-independent cpuset type This module can be used to manipulate cpu sets in a platform agnostic way. Use the type cpu_set_t/cpuset_t if available on the platform, or fallback to unsigned long, which limits de facto the maximum cpu index to LONGBITS.	2021-04-23 16:06:49 +02:00
Willy Tarreau	5e65f4276b	CLEANUP: compression: remove calls to SLZ init functions As we now embed the library we don't need to support the older 1.0 API any more, so we can remove the explicit calls to slz_make_crc_table() and slz_prepare_dist_table().	2021-04-22 16:11:19 +02:00
Willy Tarreau	12840be005	BUILD: compression: switch SLZ from out-of-tree to in-tree Now that SLZ is merged, let's update the makefile and compression files to use it. As a result, SLZ_INC and SLZ_LIB are neither defined nor used anymore. USE_SLZ is enabled by default ("USE_SLZ=default") and can be disabled by passing "USE_SLZ=" or by enabling USE_ZLIB=1. The doc was updated to reflect the changes.	2021-04-22 16:08:25 +02:00
Willy Tarreau	ab2b7828e2	IMPORT: slz: import slz into the tree SLZ is rarely packaged by distros and there have been complaints about the CPU and memory usage of ZLIB, leading to some suggestions to better address the issue by simply integrating SLZ into the tree (just 3 files). See discussions below: https://www.mail-archive.com/haproxy@formilux.org/msg38037.html https://www.mail-archive.com/haproxy@formilux.org/msg40079.html https://www.mail-archive.com/haproxy@formilux.org/msg40365.html This patch does just this, after minor adjustments to these files: - tables.h was renamed to slz-tables.h - tables.h had the precomputed tables removed since not used here - slz.c uses includes <import/slz> instead of "slz.h" The slz commit imported here was b06c172 ("slz: avoid a build warning with -Wimplicit-fallthrough"). No other change was performed either to SLZ nor to haproxy at this point so that this operation may be replicated if needed for a future version.	2021-04-22 15:50:41 +02:00
Maximilian Mader	ff3bb8b609	MINOR: uri_normalizer: Add a `strip-dot` normalizer This normalizer removes "/./" segments from the path component. Usually the dot refers to the current directory which renders those segments redundant. See GitHub Issue #714.	2021-04-21 12:15:14 +02:00
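A minimal sketch of the `/./` removal described above, assuming an in-place rewrite of a NUL-terminated path. The function name `strip_dot_segments` is hypothetical and this is not the code from the commit. It only handles complete `/./` segments, as the message states, and leaves a trailing `/.` or any `..` segment untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes "/./" segments in place, returns new length. */
static size_t strip_dot_segments(char *path)
{
	size_t in = 0, out = 0;

	while (path[in]) {
		if (path[in] == '/' && path[in + 1] == '.' && path[in + 2] == '/') {
			/* drop "/." and re-examine the '/' that follows, so that
			 * consecutive segments such as "/././" collapse as well.
			 */
			in += 2;
			continue;
		}
		path[out++] = path[in++];
	}
	path[out] = '\0';
	return out;
}
```

Re-examining the slash that closes each segment, rather than skipping all three bytes, is what lets runs of `/./` collapse in a single pass.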
Willy Tarreau	2b71810cb3	CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion The current "ADD" vs "ADDQ" is confusing because when thinking in terms of appending at the end of a list, "ADD" naturally comes to mind, but here it does the opposite, it inserts. Several times already it's been incorrectly used where ADDQ was expected, the latest of which was a fortunate accident explained in `6fa922562` ("CLEANUP: stream: explain why we queue the stream at the head of the server list"). Let's use more explicit (but slightly longer) names now: LIST_ADD -> LIST_INSERT LIST_ADDQ -> LIST_APPEND LIST_ADDED -> LIST_INLIST LIST_DEL -> LIST_DELETE The same is true for MT_LISTs, including their "TRY" variant. LIST_DEL_INIT keeps its short name to encourage to use it instead of the lazier LIST_DELETE which is often less safe. The change is large (~674 non-comment entries) but is mechanical enough to remain safe. No permutation was performed, so any out-of-tree code can easily map older names to new ones. The list doc was updated.	2021-04-21 09:20:17 +02:00
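The difference between inserting at the head and appending at the tail can be illustrated with a small circular doubly linked list. This is a sketch only: it uses plain functions with hypothetical lowercase names (`list_init`, `list_insert`, `list_append`) rather than the macros renamed by the commit, and the `n`/`p` field names are an assumption made for this example.

```c
#include <stddef.h>

/* Circular doubly linked list: an empty head points to itself. */
struct list {
	struct list *n; /* next */
	struct list *p; /* previous */
};

static void list_init(struct list *head)
{
	head->n = head->p = head;
}

/* Insert <elem> right after <head>: it becomes the FIRST element. */
static void list_insert(struct list *head, struct list *elem)
{
	elem->n = head->n;
	elem->p = head;
	head->n->p = elem;
	head->n = elem;
}

/* Append <elem> right before <head>: it becomes the LAST element. */
static void list_append(struct list *head, struct list *elem)
{
	elem->p = head->p;
	elem->n = head;
	head->p->n = elem;
	head->p = elem;
}
```

Appending `a`, then `b`, then inserting `c` yields the order `c, a, b` when walking from the head, which is the behaviour the old `ADD` name made easy to get wrong.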
Willy Tarreau	942b89f7dc	BUILD: pools: fix build with DEBUG_FAIL_ALLOC Amaury noticed that I managed to break the build of DEBUG_FAIL_ALLOC for the second time with `207c09509` ("MINOR: pools: move the fault injector to __pool_alloc()"). The joy of endlessly reworking patch sets... No backport is needed, that was in the just merged cleanup series.	2021-04-19 18:36:48 +02:00
Willy Tarreau	096b6cf581	CLEANUP: pools: declare dummy pool functions to remove some ifdefs By having a pair of dummy pool_get_from_cache() and pool_put_to_cache() we can remove some ugly ifdefs, so let's do this. We've already done it for the shared cache.	2021-04-19 15:24:33 +02:00
Willy Tarreau	b2a853d5f0	CLEANUP: pools: uninline pool_put_to_cache() This function has become too big (251 bytes) and is now hurting performance a lot, with up to 4% request rate being lost over the last pool changes. Let's move it to pool.c as a regular function. Other attempts were made to cut it in half but it's still inefficient. Doing this results in saving ~90kB of object code, and even 112kB since the pool changes, with code that is even slightly faster! Conversely, pool_get_from_cache(), which remains half of this size, is still faster inlined, likely in part due to the immediate use of the returned pointer afterwards.	2021-04-19 15:24:33 +02:00
Willy Tarreau	43d4ed548f	CLEANUP: pools: merge pool_{get_from,put_to}_local_caches with generic ones Since pool_get_from_cache() and pool_put_to_cache() were now only wrappers to the local cache versions which do all the job, let's merge them together so that there is no more local-cache specific function.	2021-04-19 15:24:33 +02:00
Willy Tarreau	d56db11447	CLEANUP: pools: make the local cache allocator fall back to the shared cache Now when pool_get_from_local_cache() fails, it automatically falls back to pool_get_from_shared_cache(), which used to always be done in pool_get_from_cache(). Thus now the API is simpler as we always allocate and free from/to the local caches.	2021-04-19 15:24:33 +02:00
Willy Tarreau	fa19d20ac4	MEDIUM: pools: make pool_put_to_cache() always call pool_put_to_local_cache() Till now it used to call it only if there were not too many objects into the local cache otherwise would send the latest one directly into the shared cache. Now it always sends to the local cache and it's up to the local cache to free its oldest objects. From a cache freshness perspective it's better this way since we always evict cold objects instead of hot ones. From an API perspective it's better because it will help make the shared cache invisible to the public API.	2021-04-19 15:24:33 +02:00
Willy Tarreau	147e1fa385	MINOR: pools: create unified pool_{get_from,put_to}_cache() These two functions are now responsible for allocating directly from the cache and releasing to the cache. Now the pool_alloc() function simply does this: if cache enabled return pool_alloc_from_cache() if not NULL return pool_alloc_nocache() otherwise and the pool_free() function does this: if cache enabled pool_put_to_cache() else pool_free_nocache() For now this only introduces these two functions without changing anything else, but the goal is to soon allow to make them implementation-specific.	2021-04-19 15:24:33 +02:00
Willy Tarreau	b8498e961a	MEDIUM: pools: make CONFIG_HAP_POOLS control both local and shared pools Continuing the unification of local and shared pools, now the usage of pools is governed by CONFIG_HAP_POOLS without which allocations and releases are performed directly from the OS using pool_alloc_nocache() and pool_free_nocache().	2021-04-19 15:24:33 +02:00
Willy Tarreau	45e4e28161	MINOR: pools: factor the release code into pool_put_to_os() There are two levels of freeing to the OS: - code that wants to keep the pool's usage counters updated uses pool_free_area() and handles the counters itself. That's what pool_put_to_shared_cache() does in the no-global-pools case. - code that does not want to update the counters because they were already updated only calls pool_free_area(). Let's extract these calls to establish the symmetry with pool_get_from_os() and pool_alloc_nocache(), resulting in pool_put_to_os() (which only updates the allocated counter) and pool_free_nocache() (which also updates the used counter). This will later allow to simplify the generic code.	2021-04-19 15:24:33 +02:00
Willy Tarreau	acf0c54491	MINOR: pools: move pool_free_area() out of the lock in the locked version Calling pool_free_area() inside a lock in pool_put_to_shared_cache() is a very bad idea. Fortunately this only happens on the lowest end platforms which almost never use threads or in very small counts. This change consists in zeroing the pointer once already released to the cache in the first test so that the second stage knows if it needs to pass it to the OS or not. This has slightly reduced the length of the	2021-04-19 15:24:33 +02:00
Willy Tarreau	2b5579f6da	MINOR: pools: always use atomic ops to maintain counters A part of the code cannot be factored out because it still uses non-atomic inc/dec for pool->used and pool->allocated as these are located under the pool's lock. While it can make sense in terms of bus cycles, it does not make sense in terms of code normalization. Further, some operations were still performed under a lock that could be totally removed via the use of atomic ops. There is still one occurrence in pool_put_to_shared_cache() in the locked code where pool_free_area() is called under the lock, which must absolutely be fixed.	2021-04-19 15:24:33 +02:00
Willy Tarreau	13843641e5	MINOR: pools: split the OS-based allocator in two Now there's one part dealing with the allocation itself and keeping counters up to date, and another one on top of it to return such an allocated pointer to the user and update the use count and stats. This is in anticipation for being able to group cache-related parts. The release code is still done at once.	2021-04-19 15:24:33 +02:00
Willy Tarreau	207c095098	MINOR: pools: move the fault injector to __pool_alloc() Till now it was limited to objects allocated from the OS which means it had little use as soon as pools were enabled. Let's move it upper in the layers so that any code can benefit from fault injection. In addition this allows to pass a new flag POOL_F_NO_FAIL to disable it if some callers prefer a no-failure approach.	2021-04-19 15:24:33 +02:00
Willy Tarreau	84ebfabf7f	MINOR: tools: add statistical_prng_range() to get a random number over a range This is simply a multiply and shift from statistical_prng() but it's made easily accessible.	2021-04-19 15:24:33 +02:00
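The "multiply and shift" technique mentioned above maps a 32-bit random value onto `[0, range)` without a division. The sketch below assumes the input comes from some 32-bit PRNG; the name `scale_to_range` is hypothetical and the code is not taken from the commit.

```c
#include <stdint.h>

/* Scale a uniformly distributed 32-bit value <rnd> into [0, range).
 * The 64-bit product rnd * range is below range * 2^32, so its upper
 * 32 bits are always strictly lower than <range>.
 */
static uint32_t scale_to_range(uint32_t rnd, uint32_t range)
{
	return (uint32_t)(((uint64_t)rnd * range) >> 32);
}
```

For example, with `range = 10`, an input of `0` gives `0`, `0x80000000` gives `5` and `0xFFFFFFFF` gives `9`. Compared with `rnd % range`, this avoids the cost of a divide.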
Willy Tarreau	635cced32f	CLEANUP: pools: rename __pool_free() to pool_put_to_shared_cache() Now the multi-level cache becomes more visible: pool_get_from_local_cache() pool_put_to_local_cache() pool_get_from_shared_cache() pool_put_to_shared_cache()	2021-04-19 15:24:33 +02:00
Willy Tarreau	8c77ee5ae5	CLEANUP: pools: rename pool_*_{from,to}_cache() to *_local_cache() The functions were rightfully called from/to_cache when the thread-local cache was considered as the only cache, but this is getting terribly confusing. Let's call them from/to local_cache to make it clear that it is not related with the shared cache. As a side note, since pool_evict_from_cache() used not to work for a particular pool but for all of them at once, it was renamed to pool_evict_from_local_caches() (plural form).	2021-04-19 15:24:33 +02:00
Willy Tarreau	2f03dcde91	CLEANUP: pools: rename __pool_get_first() to pool_get_from_shared_cache() This is exactly what it is, the entry is retrieved from the shared cache when it is defined. The implementation that is enabled with CONFIG_HAP_NO_GLOBAL_POOLS continues to return NULL.	2021-04-19 15:24:33 +02:00
Willy Tarreau	2543211830	CLEANUP: pools: move the lock to the only __pool_get_first() that needs it Now that __pool_alloc() only surrounds __pool_get_first() with the lock, let's move it to the only variant that requires it and remove the ugly ifdefs from the function. This is safe because nobody else calls this function.	2021-04-19 15:24:33 +02:00
Willy Tarreau	8ee9df57db	MINOR: pools: call pool_alloc_nocache() out of the pool's lock In __pool_alloc(), historically we used to factor out the pool's lock between __pool_get_first() and __pool_refill_alloc(), resulting in real malloc() or mmap() calls being performed under the pool lock (for platforms using the locked shared pools). As this is not needed anymore, let's move the call out of the lock, it may improve allocation patterns on some platforms. This also makes __pool_alloc() cleaner as we see a first attempt to allocate from the local cache, then a second from the shared cache then a real allocation.	2021-04-19 15:24:33 +02:00
Willy Tarreau	8fe726f118	CLEANUP: pools: re-merge pool_refill_alloc() and __pool_refill_alloc() They were strictly equivalent, let's remerge them and rename them to pool_alloc_nocache() as it's the call which performs a real allocation which does not check nor update the cache. The only difference in the past was the former taking the lock and not the second but now the lock is not needed anymore at this stage since the pool's list is not touched. In addition, given that the "avail" argument is no longer used by the function nor by its callers, let's drop it.	2021-04-19 15:24:33 +02:00
Willy Tarreau	64383b8181	MINOR: pools: make the basic pool_refill_alloc()/pool_free() update needed_avg This is a first step towards unifying all the fallback code. Right now these two functions are the only ones which do not update the needed_avg rate counter since there's currently no shared pool kept when using them. But their code is similar to what could be used everywhere except for this one, so let's make them capable of maintaining usage statistics. As a side effect the needed field in "show pools" will now be populated.	2021-04-19 15:24:33 +02:00
Willy Tarreau	53a7fe49aa	MINOR: pools: enable the fault injector in all allocation modes The mem_should_fail() call enabled by DEBUG_FAIL_ALLOC used to be placed only in the no-cache version of the allocator. Now we can generalize it to all modes and remove the exclusive test on CONFIG_HAP_NO_GLOBAL_POOLS.	2021-04-19 15:24:33 +02:00
Willy Tarreau	2d6f628d34	MINOR: pools: rename CONFIG_HAP_LOCAL_POOLS to CONFIG_HAP_POOLS We're going to make the local pool always present unless pools are completely disabled. This means that pools are always enabled by default, regardless of the use of threads. Let's drop this notion of "local" pools and make it just "pool". The equivalent debug option becomes DEBUG_NO_POOLS instead of DEBUG_NO_LOCAL_POOLS. For now this changes nothing except the option and dropping the dependency on USE_THREAD.	2021-04-19 15:24:33 +02:00
Willy Tarreau	d5140e7c6f	MINOR: pool: remove the size field from pool_cache_head Everywhere we have access to the pool so we don't need to cache a copy of the pool's size into the pool_cache_head. Let's remove it.	2021-04-19 15:24:33 +02:00
Willy Tarreau	9f3129e583	MEDIUM: pools: move the cache into the pool header Initially per-thread pool caches were stored into a fixed-size array. But this was a bit ugly because the last allocated pools were not able to benefit from the cache at all. As a work around to preserve performance, a size of 64 cacheable pools was set by default (there are 51 pools at the moment, excluding any addon and debugging code), so all in-tree pools were covered, at the expense of higher memory usage. In addition an index had to be calculated for each pool, and was used to access the pool cache head into that array. The pool index was not even stored into the pools so it was required to determine it to access the cache when the pool was already known. This patch changes this by moving the pool cache head into the pool head itself. This way it is certain that each pool will have its own cache. This removes the need for index calculation. The pool cache head is 32 bytes long so it was aligned to 64B to avoid false sharing between threads. The extra cost is not huge (~2kB more per pool than before), and we'll make better use of that space soon. The pool cache head contains the size, which should probably be removed since it's already in the pool's head.	2021-04-19 15:24:33 +02:00
Willy Tarreau	fff96b441f	CLEANUP: pools: remove unused arguments to pool_evict_from_cache() In commit `fb117e6a8` ("MEDIUM: memory: don't let pool_put_to_cache() free the objects itself") pool_evict_from_cache() was introduced with no argument, yet the only call place passes it the pool, the pointer and the index number! Let's remove these as they even let the reader think that the function does something specific to the current pool while it's not the case.	2021-04-19 15:24:33 +02:00
Tim Duesterhus	5be6ab269e	MEDIUM: http_act: Rename uri-normalizers This patch renames all existing uri-normalizers into a more consistent naming scheme: 1. The part of the URI that is being touched. 2. The modification being performed as an explicit verb.	2021-04-19 09:05:57 +02:00
Tim Duesterhus	a407193376	MINOR: uri_normalizer: Add a `percent-upper` normalizer This normalizer uppercases the hexadecimal characters used in percent-encoding. See GitHub Issue #714.	2021-04-19 09:05:57 +02:00
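As an illustration of the `percent-upper` behaviour described above, the following sketch uppercases the two hex digits of every well-formed `%xx` triplet in place and leaves everything else alone. The function name `percent_upper` is hypothetical here, and the handling of malformed triplets (left untouched) is an assumption of this example, not something the commit message specifies.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: uppercases hex digits of "%xx" triplets in place. */
static void percent_upper(char *s)
{
	for (size_t i = 0; s[i]; i++) {
		/* isxdigit('\0') is false, so a truncated triplet at the end
		 * of the string never causes a read past the terminator.
		 */
		if (s[i] == '%' &&
		    isxdigit((unsigned char)s[i + 1]) &&
		    isxdigit((unsigned char)s[i + 2])) {
			s[i + 1] = (char)toupper((unsigned char)s[i + 1]);
			s[i + 2] = (char)toupper((unsigned char)s[i + 2]);
			i += 2; /* skip the two digits just processed */
		}
	}
}
```

Applied to `/a%2f%3ab`, this yields `/a%2F%3Ab`: only the digits inside the triplets change, and the trailing literal `b` keeps its case.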
Tim Duesterhus	d7b89be30a	MINOR: uri_normalizer: Add a `sort-query` normalizer This normalizer sorts the `&` delimited query parameters by parameter name. See GitHub Issue #714.	2021-04-19 09:05:57 +02:00
Tim Duesterhus	560e1a6352	MINOR: uri_normalizer: Add support for suppressing leading `../` for dotdot normalizer This adds an option to suppress `../` at the start of the resulting path.	2021-04-19 09:05:57 +02:00
Tim Duesterhus	9982fc2bbd	MINOR: uri_normalizer: Add a `dotdot` normalizer to http-request normalize-uri This normalizer merges `../` path segments with the preceding segment, removing both the preceding segment and the `../`. Empty segments do not receive special treatment. The `merge-slashes` normalizer should be executed first. See GitHub Issue #714.	2021-04-19 09:05:57 +02:00
Tim Duesterhus	d371e99d1c	MINOR: uri_normalizer: Add a `merge-slashes` normalizer to http-request normalize-uri This normalizer merges adjacent slashes into a single slash, thus removing empty path segments. See GitHub Issue #714.	2021-04-19 09:05:57 +02:00
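The slash merging described above amounts to dropping any `/` that directly follows another `/` in the output. A minimal in-place sketch, with the hypothetical name `merge_slashes` and assuming the input holds only the path component as a NUL-terminated string:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: collapses runs of '/' in place, returns new length. */
static size_t merge_slashes(char *path)
{
	size_t in = 0, out = 0;

	while (path[in]) {
		/* skip a slash if the last character written was a slash */
		if (path[in] == '/' && out > 0 && path[out - 1] == '/') {
			in++;
			continue;
		}
		path[out++] = path[in++];
	}
	path[out] = '\0';
	return out;
}
```

For instance `//a///b/` becomes `/a/b/`. Running this step before resolving `..` segments avoids treating an empty segment as a directory.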
Tim Duesterhus	d2bedcc4ab	MINOR: uri_normalizer: Add `http-request normalize-uri` This patch adds the `http-request normalize-uri` action that was requested in GitHub issue #714. Normalizers will be added in the next patches.	2021-04-19 09:05:57 +02:00
Tim Duesterhus	0ee1ad5675	MINOR: uri_normalizer: Add `enum uri_normalizer_err` This enum will serve as the return type for each normalizer.	2021-04-19 09:05:57 +02:00
Tim Duesterhus	dbd25c34de	MINOR: uri_normalizer: Add uri_normalizer module This is in preparation for future patches.	2021-04-19 09:05:57 +02:00
Christopher Faulet	76b44195c9	MINOR: threads: Only consider running threads to end a thread harmless period When a thread ends its harmless period, we must only consider running threads when testing threads_want_rdv_mask mask. To do so, we reintroduce all_threads_mask mask in the bitwise operation (It was removed to fix a deadlock). Note that for now it is useless because there is no way to stop threads or to have threads reserved for another task. But it is safer this way to avoid bugs in the future.	2021-04-17 11:14:58 +02:00
Christopher Faulet	f63a185500	BUG/MEDIUM: threads: Ignore current thread to end its harmless period A previous patch was pushed to fix a deadlock when an isolated thread ends its harmless period (`a9a9e9aac` ["BUG/MEDIUM: thread: Fix a deadlock if an isolated thread is marked as harmless"]). But, unfortunately, the fix is incomplete. The same must be done in the outer loop, in thread_harmless_end() function. The current thread must be ignored when threads_want_rdv_mask mask is tested. This patch must also be backported as far as 2.0.	2021-04-17 11:14:58 +02:00
Alex	41007a6835	MINOR: sample: converter: Add mjson library. This library is required for the subsequent patch which adds the JSON query possibility. It is necessary to change the include statement in "src/mjson.c" because the imported includes in haproxy are in "include/import" orig: #include "mjson.h" new: #include <import/mjson.h>	2021-04-15 17:05:38 +02:00
Tim Duesterhus	763342646f	MINOR: ist: Add `istclear(struct ist*)` istclear allows one to easily reset an ist to zero-size, while preserving the previous size, indicating the length of the underlying buffer.	2021-04-14 19:49:33 +02:00
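The idea behind this helper can be sketched with a simple pointer-plus-length string view. The type and function names below (`struct str_view`, `view_clear`) are hypothetical stand-ins chosen for this example, not the definitions from the commit: the function sets the length to zero and hands back the previous length, which the caller can keep as the capacity of the underlying buffer.

```c
#include <stddef.h>

/* Hypothetical pointer + length string view used for this illustration. */
struct str_view {
	char *ptr;
	size_t len;
};

/* Resets <v> to zero size and returns the size it had before the call. */
static size_t view_clear(struct str_view *v)
{
	size_t prev = v->len;

	v->len = 0;
	return prev;
}
```

A typical use would be to wrap a scratch buffer with its full size, clear it, and then use the returned value as the maximum number of bytes that may be written through the view.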
Moemen MHEDHBI	92f7d43c5d	MINOR: sample: add ub64dec and ub64enc converters ub64dec and ub64enc are the base64url equivalent of b64dec and base64 converters. base64url encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. It is also used in the JWT (JSON Web Token) standard. RFC1421 mention in base64.c file is deprecated so it was replaced with RFC4648 to which existing converters, base64/b64dec, still apply. Example: HAProxy: http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec] Client: TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg $ curl -H "Authorization: Bearer ${TOKEN}" http://haproxy.local {"user":"foo","key":"chae6AhXai6e"}	2021-04-13 17:28:13 +02:00
Christopher Faulet	0c6d1dcf7d	BUG/MINOR: listener: Handle allocation error when allocating a new bind_conf Allocation errors are now handled in bind_conf_alloc() functions. Thus callers, when not already done, are also updated to catch NULL return value. This patch may be backported (at least partially) to all stable versions. However, it only fixes errors during configuration parsing. Thus it is not mandatory.	2021-04-12 21:33:43 +02:00
Christopher Faulet	147b8c919c	MINOR: checks/trace: Register a new trace source with its events Add the trace support for the checks. Only tcp-check based health-checks are supported, including the agent-check. In traces, the first argument is always a check object. So it is easy to get all info related to the check. The tcp-check ruleset, the conn-stream and the connection, the server state...	2021-04-12 12:09:36 +02:00
Christopher Faulet	6d80b63e3c	MINOR: trace: Add the checks as a possible trace source To be able to add the trace support for the checks, a new kind of source must be added for this purpose.	2021-04-12 12:09:36 +02:00
Willy Tarreau	7b1425a91b	MINOR: atomic: reimplement the relaxed version of x86 BTS/BTR Olivier spotted that I messed up during a rebase of commit `92c059c2a` ("MINOR: atomic: implement native BTS/BTR for x86"), losing the x86 version of the BTS/BTR and leaving the generic version for it instead of having this block in the #else. Since this variant is not used for now it was easy to overlook it. Let's re-implement it here.	2021-04-12 10:01:44 +02:00
Willy Tarreau	c4c80fb4ea	MINOR: time: move the time initialization out of tv_update_date() The time initialization was made a bit complex because we rely on a dummy negative argument to reset all fields, leaving no distinction between process-level initialization and thread-level initialization. This patch changes this by introducing two functions, one for the process and the second one for the threads. This removes ambiguous test and makes sure that the relevant fields are always initialized exactly once. This also offers a better solution to the bug fixed in commit `b48e7c001` ("BUG/MEDIUM: time: make sure to always initialize the global tick") as there is no more special values for global_now_ms. It's simple enough to be backported if any other time-related issues are encountered in stable versions in the future.	2021-04-11 23:45:48 +02:00
Willy Tarreau	61c72c366e	CLEANUP: time: remove the now unused ms_left_scaled It was only used by freq_ctr and is not used anymore. In addition the local curr_sec_ms was removed, as well as the equivalent extern definitions which did not exist anymore either.	2021-04-11 14:01:53 +02:00
Willy Tarreau	d46ed5c26b	MINOR: freq_ctr: simplify and improve the update function update_freq_ctr_period() was still not very clean and didn't wait for the rotation lock to be dropped before trying again, thus maintaining the contention at a high level. In addition, the rotation update was made in three steps, which are not very efficient in terms of bus cycles. Here the wait loop was reworked so that the fast path remains short and that the contended path waits for the lock to be dropped before attempting another write, but it only waits a relax cycle before attempting a read. The rotation block was simplified to remove a test that was already validated by the first loop, and so that the retrieval of the current period, its reset and its increment are all performed in a single atomic op and the store to the previous period is performed immediately after. All this results in significantly smaller code for the inline function (~1kB total) and a shorter critical path.	2021-04-11 14:01:53 +02:00
Willy Tarreau	6339c19cac	MINOR: freq_ctr: add cpu_relax in the rotation loop of update_freq_ctr_period() When counters are rotated, there is contention between the threads which can slow down the operation of the thread performing the rotation. Let's apply a cpu_relax there to let the first thread finish faster.	2021-04-11 11:12:57 +02:00
Willy Tarreau	fc6323ad82	MEDIUM: freq_ctr: replace the per-second counters with the generic ones It remains cumbersome to preserve two versions of the freq counters and two different internal clocks just for this. In addition, the savings from using two different mechanisms are not that important as the only saving is a divide that is replaced by a multiply, but now thanks to the freq_ctr_total() unification the code could also be simplified to optimize it in case of constants. This patch turns all non-period freq_ctr functions to static inlines which call the period-based ones with a period of 1 second. A direct benefit is that a single internal clock is now needed for any counter and that they now all rely on ticks. These 1-second counters are essentially used to report request rates and to enforce a connection rate limitation in listeners. It was verified that these continue to work like before.	2021-04-11 11:12:55 +02:00
Willy Tarreau	fa1258f02c	MINOR: freq_ctr: unify freq_ctr and freq_ctr_period into freq_ctr Both structures are identical except the name of the field starting the period and its description. Let's call them all freq_ctr and the period's start "curr_tick" which is generic. This is only a temporary change and fields are expected to remain the same with no code change (verified).	2021-04-11 11:11:27 +02:00
Willy Tarreau	d209c87142	MINOR: freq_ctr: add the missing next_event_delay_period() There was still no function to compute a wait time for periods, let's implement it on top of freq_ctr_total() as we'll soon need it for the per-second one. The divide here is applied on the frequency so that it will be replaced with a reciprocal multiply when constant.	2021-04-11 11:11:03 +02:00
Willy Tarreau	607be24a85	MEDIUM: freq_ctr: reimplement freq_ctr_remain_period() from freq_ctr_total() Now the function becomes an inline one and only contains a divide and a max. The divide will automatically go away with constant periods.	2021-04-11 11:11:03 +02:00
Willy Tarreau	a7a31b2602	MEDIUM: freq_ctr: make read_freq_ctr_period() use freq_ctr_total() This one is the easiest to implement, it just requires a call and a divide of the result. Anti-flapping correction for low-rates was preserved. Now calls using a constant period will be able to use a reciprocal multiply for the period instead of a divide.	2021-04-11 11:11:03 +02:00
Willy Tarreau	f3a9f8dc5a	MINOR: freq_ctr: add a generic function to report the total value Most of the functions designed to read a counter over a period go through the same complex loop and only differ in the way they use the returned values, so it was worth implementing all this into freq_ctr_total() which returns the total number of events over a period so that the caller can finish its operation using a divide or a remaining time calculation. As a special case, read_freq_ctr_period() doesn't take pending events but requires to enable an anti-flapping correction at very low frequencies. Thus the function implements it when pend<0. Thanks to this function it will be possible to reimplement the other ones as inline and merge the per-second ones with the arbitrary period ones without always adding the cost of a 64 bit divide.	2021-04-11 11:10:57 +02:00
Willy Tarreau	ff88270ef9	MINOR: pool: move pool declarations to read_mostly All pool heads are accessed via a pointer and should not be shared with highly written variables. Move them to the read_mostly section.	2021-04-10 19:27:41 +02:00
Willy Tarreau	f459640ef6	MINOR: global: declare a read_mostly section Some variables are mostly read (mostly pointers) but they tend to be merged with other ones in the same cache line, slowing their access down in multi-thread setups. This patch declares an empty, aligned variable in a section called "read_mostly". This will force a cache-line alignment on this section so that any variable declared in it will be certain to avoid false sharing with other ones. The section will be eliminated at link time if not used. A __read_mostly attribute was added to compiler.h to ease use of this section.	2021-04-10 19:27:41 +02:00
Willy Tarreau	ba386f6a8d	CLEANUP: initcall: rely on HA_SECTION_* instead of defining its own Now initcalls are defined using the regular section definitions from compiler.h in order to ease maintenance.	2021-04-10 19:27:41 +02:00
Willy Tarreau	5bec4c42ed	MINOR: compiler: add macros to declare section names HA_SECTION() is used as an attribute to force a section name. This is required because OSX prepends "__DATA, " in front of the declaration. HA_SECTION_START() and HA_SECTION_STOP() are used as post-attribute on variable declaration to designate the section start/end (needed only on OSX, empty on others). For platforms with an obsolete linker, all macros are left empty. It would possibly still work on some of them but this will not be needed anyway.	2021-04-10 19:27:41 +02:00
Willy Tarreau	731f0c6502	CLEANUP: initcall: rename HA_SECTION to HA_INIT_SECTION The HA_SECTION name is too generic and will be reused globally. Let's rename this one.	2021-04-10 19:27:41 +02:00
Willy Tarreau	afa9bc0ec5	MINOR: initcall: uniformize the section names between MacOS and other unixes Due to length restrictions on OSX the initcall sections are called "i_" there while they're called "init_" on other OSes. However the start and end of sections are still called "__start_init_" and "__stop_init_", which forces to have distinct code between the OSes. Let's switch everyone to "i_" and rename the symbols accordingly.	2021-04-10 19:27:41 +02:00
Willy Tarreau	ad14c2681b	MINOR: trace: replace the trace() inline function with an equivalent macro The trace() function is convenient to avoid calling trace() when traces are not enabled, but there are now some callers which place complex expressions in their trace calls, which results in all of them being evaluated before being passed as arguments to the trace() function. This needlessly wastes precious CPU cycles. Let's change the function for a macro, so that the arguments are now only evaluated when the source has traces enabled. However having a generic macro being called "trace()" can easily cause conflicts with innocent code so we rename it "_trace". Just doing this has resulted in a 2.5% increase of the HTTP/1 request rate.	2021-04-10 19:27:41 +02:00
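The function-versus-macro difference described in the trace() commit above can be sketched as follows. This is only an illustration, not the HAProxy code: `trace_enabled`, `costly_arg()`, `trace_fn()` and `trace_macro()` are hypothetical names introduced here.

```c
#include <stdio.h>

static int trace_enabled;   /* hypothetical global on/off switch */
static int arg_evals;       /* counts how often the costly argument is computed */

/* Stand-in for a complex expression placed in a trace call. */
static int costly_arg(void)
{
    arg_evals++;
    return 42;
}

/* Function form: the caller evaluates the argument before the call,
 * even when tracing is disabled. */
static inline void trace_fn(int value)
{
    if (trace_enabled)
        printf("trace: %d\n", value);
}

/* Macro form: the argument expression sits inside the "if" body, so it
 * is only evaluated when tracing is enabled. */
#define trace_macro(value)                      \
    do {                                        \
        if (trace_enabled)                      \
            printf("trace: %d\n", (value));     \
    } while (0)
```

With tracing disabled, `trace_fn(costly_arg())` still runs `costly_arg()`, while `trace_macro(costly_arg())` skips it entirely.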
Willy Tarreau	9057a0026e	CLEANUP: pattern: make all pattern tables read-only Interestingly, all arrays used to declare patterns were read-write while only hard-coded. Let's mark them const so that they move from data to rodata and don't risk to experience false sharing.	2021-04-10 17:49:41 +02:00
Tim Duesterhus	403fd722ac	CLEANUP: Remove useless malloc() casts This is not C++.	2021-04-08 20:11:58 +02:00
Tim Duesterhus	fea59fcf79	CLEANUP: ist: Remove unused `count` argument from `ist2str*` This argument is not being used inside the function (and the functions themselves are unused as well) and not documented. Its purpose is not clear. Just remove it.	2021-04-08 19:40:59 +02:00
Tim Duesterhus	b8ee894b66	CLEANUP: htx: Make http_get_stline take a `const struct` Nothing is being modified there, so this can be `const`.	2021-04-08 19:40:59 +02:00
Tim Duesterhus	fbc2b79743	MINOR: ist: Rename istappend() to __istappend() Indicate that this function is not inherently safe by adding two underscores as a prefix.	2021-04-08 19:35:52 +02:00
Willy Tarreau	1197459e0a	BUG/MAJOR: fd: switch temp values to uint in fd_stop_both() With latest commit `f50906519` ("MEDIUM: fd: merge fdtab[].ev and state for FD_EV_* and FD_POLL_* into state") one occurrence of a pair of chars was missed in fd_stop_both(), resulting in the operation to fail if the upper flags were set. Interestingly it managed to fail 2 tests in all setups in the CI while all used to work fine on my local machines. Probably that the reason is that the chars had enough room above them for the CAS to fail then refill "old" overwriting the upper parts of the stack, and that thanks to this the subsequent tests worked. With ASAN being used on lots of tests, it very likely caught it but used to only report failed tests with no more info. No backport is needed, as this was never released nor backported.	2021-04-07 20:46:26 +02:00
Tim Duesterhus	8daf8dceb9	MINOR: ist: Add `istsplit(struct ist*, char)` istsplit is a combination of iststop + istadv.	2021-04-07 19:50:43 +02:00
Tim Duesterhus	90aa8c7f02	MINOR: ist: Add `istshift(struct ist*)` istshift() returns the first character and advances the ist by 1.	2021-04-07 19:50:43 +02:00
Tim Duesterhus	551eeaec91	MINOR: ist: Add `istappend(struct ist, char)` This function appends the given char to the given `ist` and returns the resulting `ist`.	2021-04-07 19:50:43 +02:00
Willy Tarreau	92c059c2ac	MINOR: atomic: implement native BTS/BTR for x86 The current BTS/BTR operations on x86 are ugly because they rely on a CAS, so they may be unfair and take time to converge. Fortunately, where they are currently used (mostly FDs) the contention is expected to be rare (mostly listeners). But this also limits their use to such few low-load cases. On x86 there is a set of BTS/BTR instructions which help for this, but before the FD's state migrated to 32 bits there was little use of them since they do not exist in 8 bits. Now at least it makes sense to use them, at the very least in order to significantly reduce the code size (one BTS instead of a CMPXCHG loop). The implementation relies on modern gcc's ability to return condition flags and limit code inflation and register spilling. The fall back is retained on the old implementation for all other situations (inappropriate target size or non-capable compiler). The code shrank by 1.6 kB on the fast path. As expected, for now on up to 4 threads there is no measurable difference of performance.	2021-04-07 18:47:22 +02:00
Willy Tarreau	fa68d2641b	CLEANUP: atomic: use the __atomic variant of BTS/BTR on modern compilers Probably due to the result of an old copy-paste, HA_ATOMIC_BTS/BTR were still implemented using the __sync_* builtins instead of the more modern __atomic_* which allow to specify the memory model. Let's update this to use the newer ones there and also implement the relaxed variants (which are not used for now).	2021-04-07 18:18:37 +02:00
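As a rough illustration of what a bit test-and-set and test-and-reset look like when built on the `__atomic_*` builtins mentioned above, here is a minimal sketch. It assumes a GCC- or Clang-compatible compiler; `bts_sketch()` and `btr_sketch()` are hypothetical names, and the native x86 instruction variant from the other commit is not shown.

```c
/* Atomically set bit <bit> in *ptr.
 * Returns 1 if the bit was already set before the call, 0 otherwise. */
static inline int bts_sketch(unsigned int *ptr, unsigned int bit)
{
    unsigned int mask = 1U << bit;

    return (__atomic_fetch_or(ptr, mask, __ATOMIC_SEQ_CST) & mask) != 0;
}

/* Atomically clear bit <bit> in *ptr.
 * Returns 1 if the bit was set before the call, 0 otherwise. */
static inline int btr_sketch(unsigned int *ptr, unsigned int bit)
{
    unsigned int mask = 1U << bit;

    return (__atomic_fetch_and(ptr, ~mask, __ATOMIC_SEQ_CST) & mask) != 0;
}
```

The memory model argument is what the `__sync_*` builtins lack: a relaxed variant would simply pass `__ATOMIC_RELAXED` instead.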
Willy Tarreau	4781b1521a	CLEANUP: atomic/tree-wide: replace single increments/decrements with inc/dec This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1) or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.	2021-04-07 18:18:37 +02:00
Willy Tarreau	22d675cb77	CLEANUP: atomic: add HA_ATOMIC_INC/DEC for unit increments Most ADD/SUB callers use them for a single unit (e.g. refcounts) and it's a pain to always pass ",1". Let's add them to simplify the API. However we currently don't add any return value. If needed in the future better report zero/non-zero than a real value for the sake of efficiency at the instruction level.	2021-04-07 18:18:37 +02:00
Willy Tarreau	185157201c	CLEANUP: atomic: add a fetch-and-xxx variant for common operations The fetch_and_xxx variant is often missing for add/sub/and/or. In fact it was only provided for ADD under the name XADD which corresponds to the x86 instruction name. But for destructive operations like AND and OR it's missing even more as it's not possible to know the value before modifying it. This patch explicitly adds HA_ATOMIC_FETCH_{OR,AND,ADD,SUB} which cover these standard operations, and renames XADD to FETCH_ADD (there were only 6 call places). In the future, backport of fixes involving such operations could simply remap FETCH_ADD(x) to XADD(x), FETCH_SUB(x) to XADD(-x), and for the OR/AND if needed, these could possibly be done using BTS/BTR. It's worth noting that xchg could have been renamed to fetch_and_store() but xchg already has well understood semantics and it wasn't needed to go further.	2021-04-07 18:18:37 +02:00
Willy Tarreau	a477150fd7	CLEANUP: atomic: make all standard add/or/and/sub operations return void In order to make sure these ones will not be used anymore in an expression, let's make them always void. New callers will now be forced to use the explicit _FETCH variant if required.	2021-04-07 18:18:37 +02:00
Willy Tarreau	1db427399c	CLEANUP: atomic: add an explicit _FETCH variant for add/sub/and/or Currently our atomic ops return a value but it's never known whether the fetch is done before or after the operation, which causes some confusion each time the value is desired. Let's create an explicit variant of these operations suffixed with _FETCH to explicitly mention that the fetch occurs after the operation, and make use of it at the few call places.	2021-04-07 18:18:37 +02:00
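The atomic cleanups in the rows above distinguish three flavors of the same operation: one that returns nothing, one that returns the value before the operation (fetch-and-xxx), and one that returns the value after it (the `_FETCH` suffix). A minimal sketch using the GCC/Clang builtins follows; the `MY_*` macro names are hypothetical and are not the HAProxy definitions.

```c
/* Returns the value held BEFORE the addition (fetch-and-add). */
#define MY_FETCH_ADD(ptr, v) __atomic_fetch_add((ptr), (v), __ATOMIC_SEQ_CST)

/* Returns the value held AFTER the addition (add-and-fetch). */
#define MY_ADD_FETCH(ptr, v) __atomic_add_fetch((ptr), (v), __ATOMIC_SEQ_CST)

/* Returns nothing, so it cannot be used inside an expression by mistake. */
#define MY_ADD(ptr, v) ((void)__atomic_fetch_add((ptr), (v), __ATOMIC_SEQ_CST))

/* Unit increment/decrement shorthands, also returning nothing. */
#define MY_INC(ptr) ((void)__atomic_fetch_add((ptr), 1, __ATOMIC_SEQ_CST))
#define MY_DEC(ptr) ((void)__atomic_fetch_sub((ptr), 1, __ATOMIC_SEQ_CST))
```

Making the plain forms `void` forces a caller who needs the result to pick the before or after variant explicitly.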
Willy Tarreau	6756d95a8e	MINOR: atomic/arm64: detect and use builtins for the double-word CAS Gcc 10.2 implements outline atomics on aarch64. These replace all inline atomic ops with a function call that checks if the machine supports LSE atomics. This comes with a small cost but allows modern machines to scale much better than with the old LL/SC ones even when built for full 8.0 compatibility. This patch enables the use of the __atomic_compare_exchange() builtin for the double-word CAS when detected as available instead of using the hand-written LL/SC version. The extra cost is negligible because we do very few DWCAS operations (essentially FD migrations and shared pools) so the cost is low but under high contention it can still be beneficial. As expected no performance difference was measured in either direction on 4-core machines with this change. This could be backported to 2.3 if it was shown that FD migrations were representing a significant source of contention, but for now it does not appear to be needed.	2021-04-07 18:18:37 +02:00
Willy Tarreau	1673c4a883	MINOR: fd: implement an exclusive syscall bit to remove the ugly "log" lock There is a function called fd_write_frag_line() that's essentially used by loggers and that is used to write an atomic message line over a file descriptor using writev(). However a lock is required around the writev() call to prevent messages from multiple threads from being interleaved. Till now a SPIN_TRYLOCK was used on a dedicated lock that was common to all FDs. This is quite not pretty as if there are multiple output pipes to collect logs, there will be quite some contention. Now that there are empty flags left in the FD state and that we can finally use atomic ops on them, let's add a flag to indicate the FD is locked for exclusive access by a syscall. At least the locking will now be on an FD basis and not the whole process, so we can remove the log_lock.	2021-04-07 18:18:37 +02:00
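The per-FD exclusive bit described above can be pictured with the following sketch. It is an assumption-laden illustration rather than the actual implementation: the flag name, its bit position, and the helper names are hypothetical, and the busy-wait loop is a simplification chosen for brevity.

```c
#include <sys/types.h>
#include <sys/uio.h>

#define FD_EXCL_SYSCALL_SKETCH (1U << 0)   /* hypothetical bit position */

/* Try to take exclusive access; returns 1 on success, 0 if already held. */
static int fd_try_excl(unsigned int *state)
{
    unsigned int prev;

    prev = __atomic_fetch_or(state, FD_EXCL_SYSCALL_SKETCH, __ATOMIC_ACQUIRE);
    return !(prev & FD_EXCL_SYSCALL_SKETCH);
}

/* Release exclusive access by clearing the flag. */
static void fd_release_excl(unsigned int *state)
{
    __atomic_fetch_and(state, ~FD_EXCL_SYSCALL_SKETCH, __ATOMIC_RELEASE);
}

/* Write all fragments with a single writev() while holding the per-FD
 * flag, so that lines from different threads are not interleaved. */
static ssize_t write_line_excl(int fd, unsigned int *state,
                               const struct iovec *iov, int cnt)
{
    ssize_t ret;

    while (!fd_try_excl(state))
        ;   /* simplistic busy-wait, for illustration only */
    ret = writev(fd, iov, cnt);
    fd_release_excl(state);
    return ret;
}
```

Because the flag lives in each FD's own state word, two threads logging to different pipes never contend with each other.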
Willy Tarreau	9063a660cc	MINOR: fd: move .exported into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state.	2021-04-07 18:10:36 +02:00
Willy Tarreau	5362bc9044	MINOR: fd: move .et_possible into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state.	2021-04-07 18:09:43 +02:00
Willy Tarreau	0cc612818d	MINOR: fd: move .initialized into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state. The bit was not cleared in fd_insert() because the only user is the function used to create and atomically send a log message to a pipe FD, which never registers the fd. Here we clear it nevertheless for the sake of clarity. Note that with an extra cleaning pass we could have a bit number here and simply use a BTS to test and set it.	2021-04-07 18:09:08 +02:00
Willy Tarreau	030dae13a0	MINOR: fd: move .cloned into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state.	2021-04-07 18:08:29 +02:00
Willy Tarreau	b41a6e9101	MINOR: fd: move .linger_risk into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state. The CLI's output state was extended to 6 digits and the linger/cloned flags moved inside the parenthesis.	2021-04-07 18:07:49 +02:00
Willy Tarreau	f509065191	MEDIUM: fd: merge fdtab[].ev and state for FD_EV_* and FD_POLL_* into state For a long time we've had fdtab[].ev and fdtab[].state which contain two arbitrary sets of information, one is mostly the configuration plus some shutdown reports and the other one is the latest polling status report which also contains some sticky error and shutdown reports. These ones used to be stored into distinct chars, complicating certain operations and not even allowing to clearly see concurrent accesses (e.g. fd_delete_orphan() would set the state to zero while fd_insert() would only set the event to zero). This patch creates a single uint with the two sets in it, still delimited at the byte level for better readability. The original FD_EV_* values remained at the lowest bit levels as they are also known by their bit value. The next step will consist in merging the remaining bits into it. The whole bits are now cleared both in fd_insert() and _fd_delete_orphan() because after a complete check, it is certain that in both cases these functions are the only ones touching these areas. Indeed, for _fd_delete_orphan(), the thread_mask has already been zeroed before a poller can call fd_update_event() which would touch the state, so it is certain that _fd_delete_orphan() is alone. Regarding fd_insert(), only one thread will get an FD at any moment, and as this FD has already been released by _fd_delete_orphan() by definition it is certain that previous users have definitely stopped touching it. Strictly speaking there's no need for clearing the state again in fd_insert() but it's cheap and will remove some doubts during some troubleshooting sessions.	2021-04-07 18:04:39 +02:00
Willy Tarreau	8d27c203ed	MEDIUM: fd: prepare FD_POLL_* to move to bits 8-15 In preparation of merging FD_POLL* and FD_EV, this only changes the value of FD_POLL_ to use bits 8-15 (the second byte). The size of the field has been temporarily extended to 32 bits already, as well as the temporary variables that carry the new composite value inside fd_update_events(). The resulting fdtab entry becomes temporarily unaligned. All places making access to .ev or FD_POLL_* were carefully inspected to make sure they were safe regarding this change. Only one temporary update was needed for the "show fd" code. The code was only slightly inflated at this step.	2021-04-07 15:08:40 +02:00
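To make the byte-level layout of the merged state word concrete, here is a small sketch. Only the layout (event flags in the low byte, polling flags in bits 8-15) comes from the commit messages above; the individual flag names and bit positions are hypothetical.

```c
#define EV_MASK_SKETCH      0x000000ffU   /* low byte: FD_EV_*-style flags */
#define POLL_MASK_SKETCH    0x0000ff00U   /* bits 8-15: FD_POLL_*-style flags */

#define EV_ACTIVE_R_SKETCH  (1U << 0)     /* hypothetical event flag */
#define POLL_IN_SKETCH      (1U << 8)     /* hypothetical polling flags */
#define POLL_ERR_SKETCH     (1U << 11)

/* Atomically replace the polling byte with <new_poll> while leaving all
 * other bits of the composite state untouched. Returns the new state. */
static unsigned int update_poll_bits(unsigned int *state, unsigned int new_poll)
{
    unsigned int old = __atomic_load_n(state, __ATOMIC_RELAXED);
    unsigned int upd;

    do {
        upd = (old & ~POLL_MASK_SKETCH) | (new_poll & POLL_MASK_SKETCH);
        /* on failure, <old> is refreshed with the current value */
    } while (!__atomic_compare_exchange_n(state, &old, upd, 0,
                                          __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));
    return upd;
}
```

With both sets in one word, a single compare-and-swap observes and updates them together, which two separate chars could not offer.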
Willy Tarreau	fc0cdfb9b7	CLEANUP: fd: remove FD_POLL_DATA and FD_POLL_STICKY The former was not used and the second was used only as a positive mask of the flags to keep instead of having the flags that are updated. Both were removed in favor of a new FD_POLL_UPDT_MASK that only mentions the updated flags. This will ease merging of state and ev later.	2021-04-07 15:08:40 +02:00
Emeric Brun	9533a70381	MINOR: log: register config file and line number on log servers. This patch registers the parsed file and the line where a log server is declared to make this information available in the configuration post check. This new information was added to the error messages emitted when resolving ring names during the post-configuration check.	2021-04-07 09:18:34 +02:00
Amaury Denoyelle	5a6926dcf0	MINOR: diag: create cfgdiag module This module is intended to serve as a placeholder for various diagnostics executed after the configuration file has been fully loaded.	2021-04-01 18:03:37 +02:00
Amaury Denoyelle	7b01a8dbdd	MINOR: global: define diagnostic mode of execution Define MODE_DIAG which is used to run haproxy in diagnostic mode. This mode is used to output extra warnings about possible configuration blunder or sub-optimal usage. It can be activated with argument '-dD'. A new output function ha_diag_warning is implemented reserved for diagnostic output. It serves to standardize the format of diagnostic messages. A macro HA_DIAG_WARN_COND is also available to automatically check if diagnostic mode is on before executing the diagnostic check.	2021-04-01 18:03:37 +02:00
Christopher Faulet	021a8e4d7b	MEDIUM: http-rules: Add wait-for-body action on request and response side Historically, an option was added to wait for the request payload (option http-buffer-request). This option has 2 drawbacks. First, it is an ON/OFF option for the whole proxy. It cannot be enabled on demand depending on the message. Then, as its name suggests, it only works on the request side. The only option to wait for the response payload was to write a dedicated filter. While it is an acceptable solution for complex applications, it is a bit overkill to simply match strings in the body. To make everyone happy, this patch adds a dedicated HTTP action to wait for the message payload, for the request or the response depending on whether it is used in an http-request or an http-response ruleset. The time to wait is configurable and, optionally, the minimum payload size to have before it stops waiting. Both the http action and the old http analyzer rely on the same internal function.	2021-04-01 16:27:40 +02:00
Christopher Faulet	581db2b829	MINOR: payload/config: Warn if a L6 sample fetch is used from an HTTP proxy L6 sample fetches are now ignored when called from an HTTP proxy. Thus, a warning is emitted during the startup if such usage is detected. It is true for most ACLs and for log-format strings. Unfortunately, it is a bit painful to do so for sample expressions. This patch relies on the commit "MINOR: action: Use a generic function to check validity of an action rule list".	2021-04-01 15:34:22 +02:00
Christopher Faulet	42c6cf9501	MINOR: action: Use a generic function to check validity of an action rule list The check_action_rules() function is now used to check the validity of an action rule list. It is used from check_config_validity() function to check L5/6/7 rulesets.	2021-04-01 15:34:22 +02:00
Christopher Faulet	3b6446f4d9	MINOR: config/proxy: Don't warn for HTTP rules in TCP if 'switch-mode http' set Warnings about ignored HTTP directives in a TCP proxy are inhibited if at least one switch-mode tcp action is configured to perform HTTP upgrades.	2021-04-01 13:22:42 +02:00
Christopher Faulet	ae863c62e3	MEDIUM: Add tcp-request switch-mode action to perform HTTP upgrade It is now possible to perform HTTP upgrades on a TCP stream from the frontend side. To do so, a tcp-request content rule must be defined with the switch-mode action, specifying the mode (for now, only http is supported) and optionally the proto (h1 or h2). This way it could be possible to set HTTP directives on a TCP frontend which will only be evaluated if an upgrade is performed. This new way to perform HTTP upgrades should progressively replace the old way, which consists in routing the request to an HTTP backend. And it should be also a good start to remove all HTTP processing from tcp-request content rules. This action is terminal, it stops the ruleset evaluation. It is only available on proxy with the frontend capability. The configuration manual has been updated accordingly.	2021-04-01 13:17:19 +02:00
Christopher Faulet	6c1fd987f6	MINOR: stream: Handle stream HTTP upgrade in a dedicated function The code responsible to perform an HTTP upgrade from a TCP stream is moved in a dedicated function, stream_set_http_mode(). The stream_set_backend() function is slightly updated, especially to correctly set the request analysers.	2021-04-01 11:06:48 +02:00
Christopher Faulet	75f619ad92	MINOR: http-ana: Simplify creation/destruction of HTTP transactions Now allocation and initialization of HTTP transactions are performed in a unique function. Historically, there were two functions because the same TXN was reset for K/A connections in the legacy HTTP mode. Now, in HTX, K/A connections are handled at the mux level. A new stream, and thus a new TXN, is created for each request. In addition, the function responsible to end the TXN is now also responsible to release it. So, now, http_create_txn() and http_destroy_txn() must be used to create and destroy an HTTP transaction.	2021-04-01 11:06:48 +02:00
Christopher Faulet	bb69d781c8	MINOR: muxes: Show muxes flags when the mux list is displayed When the mux list is displayed on "haproxy -vv" output, the mux flags are now displayed. The flags meaning may be found in the configuration manual.	2021-04-01 11:06:48 +02:00
Christopher Faulet	a460057f2e	MINOR: muxes: Add a flag to notify a mux does not support any upgrade MX_FL_NO_UPG flag may now be set on a multiplexer to explicitly disable upgrades from this mux. For now, it is set on the FCGI multiplexer because it is not supported and there is no upgrade on backend-only multiplexers. It is also set on the H2 multiplexer because it is clearly not supported.	2021-04-01 11:06:47 +02:00
Willy Tarreau	4bfc6630ba	CLEANUP: socket: replace SOL_IP/IPV6/TCP with IPPROTO_IP/IPV6/TCP Historically we've used SOL_IP/SOL_IPV6/SOL_TCP everywhere as the socket level value in getsockopt() and setsockopt() but as we've seen over time it regularly broke the build and required to have them defined to their IPPROTO_* equivalent. The Linux ip(7) man page says: Using the SOL_IP socket options level isn't portable; BSD-based stacks use the IPPROTO_IP level. And it indeed looks like a pure linuxism inherited from old examples and documentation. strace also reports SOL_* instead of IPPROTO_, which does not help... A check to linux/in.h shows they have the same values. Only SOL_SOCKET and other non-IP values make sense since there is no IPPROTO equivalent. Let's get rid of this annoying confusion by removing all redefinitions of SOL_IP/IPV6/TCP and using IPPROTO_ instead, just like any other operating system. This also removes duplicated tests for the same value. Note that this should not result in exposing syscalls to other OSes as the only ones that were still conditioned to SOL_IPV6 were for IPV6_UNICAST_HOPS which already had an IPPROTO_IPV6 equivalent, and IPV6_TRANSPARENT which is Linux-specific.	2021-03-31 08:59:34 +02:00
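For readers unfamiliar with the socket level argument discussed above, the portable form looks roughly like this. The helper names are hypothetical; only the standard `setsockopt()`/`getsockopt()` calls with the `IPPROTO_TCP` level and the `TCP_NODELAY` option are used.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable or disable TCP_NODELAY using the portable IPPROTO_TCP level
 * rather than the Linux-specific SOL_TCP. Returns 0 on success. */
static int set_tcp_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Read the option back: returns 1 if set, 0 if not, -1 on error. */
static int get_tcp_nodelay(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

`SOL_SOCKET` remains the right level for generic options such as `SO_REUSEADDR`, since it has no `IPPROTO_*` equivalent.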
Willy Tarreau	be362fd992	MINOR: compat: add short aliases for a few very commonly used types Very often we use "int" where negative numbers are not needed (and can further cause trouble) just because it's painful to type "unsigned int" or "unsigned", or ugly to use in function arguments. Similarly sometimes chars would absolutely need to be signed but nobody types "signed char". Let's add a few aliases for such types and make them part of the standard internal API so that over time we can get used to them and get rid of horrible definitions. A comment also reminds some commonly available types and their properties regarding other types.	2021-03-26 17:54:15 +01:00
Willy Tarreau	2f836de100	MINOR: action: add a new ACT_F_CLI_PARSER origin designation In order to process samples from the command line interface we'll need rules as well, and these rules will have to be marked as coming from the CLI parser. This new origin is used for this.	2021-03-26 16:34:53 +01:00
Willy Tarreau	db5e0dbea9	MINOR: sample: add a new CLI_PARSER context for samples In order to prepare for supporting calling sample expressions from the CLI, let's create a new CLI_PARSER parsing context. This one supports constants and internal samples only.	2021-03-26 16:34:53 +01:00
Willy Tarreau	01d580ae86	MINOR: action: add a new ACT_F_CFG_PARSER origin designation In order to process samples from the config file we'll need rules as well, and these rules will have to be marked as coming from the config parser. This new origin is used for this.	2021-03-26 16:23:45 +01:00
Willy Tarreau	f9a7a8fd8e	MINOR: sample: add a new CFG_PARSER context for samples We'd sometimes like to be able to process samples while parsing the configuration based on purely internal thing but that's not possible right now. Let's add a new CFG_PARSER context for samples which only permits constant samples (i.e. those which do not change in the process' life and which are stable during config parsing).	2021-03-26 16:23:45 +01:00
Willy Tarreau	be2159b946	MINOR: sample: add a new SMP_SRC_CONST sample capability This level indicates that everything is constant in the expression during the whole process' life and that it may safely be used at config parsing time.	2021-03-26 16:23:45 +01:00
Willy Tarreau	77e6a4ef0f	MINOR: sample: make smp_resolve_args() return an allocated error message For now smp_resolve_args() complains on stderr via ha_alert(), but if we want to make it a bit more dynamic, we need it to return errors in an allocated message. Let's pass it an error pointer and have it fill it. On return we indent the output if it contains more than one line.	2021-03-26 16:23:45 +01:00
Amaury Denoyelle	6f26faecd8	MINOR: proxy: define cap PR_CAP_LUA Define a new cap PR_CAP_LUA. It can be used to allocate the internal proxy for lua Socket class. This cap overrides default settings for preferable values in the lua context.	2021-03-26 15:28:33 +01:00
Amaury Denoyelle	27fefa1967	MINOR: proxy: implement a free_proxy function Move all liberation code related to a proxy in a dedicated function free_proxy in proxy.c. For now, this function is only called in haproxy.c. In the future, it will be used to free the lua proxy. This helps to clean up haproxy.c.	2021-03-26 15:28:33 +01:00
Amaury Denoyelle	476b9ad97a	REORG: split proxy allocation functions Create a new function parse_new_proxy specifically designed to allocate a new proxy from the configuration file and copy settings from the default proxy. The function alloc_new_proxy is reduced to a minimal allocation. It is used for default proxy allocation and could also be used for internal proxies such as the lua Socket proxy.	2021-03-26 15:28:33 +01:00
Amaury Denoyelle	68fd7e43d3	REORG: global: move free acl/action in their related source files Move deinit_acl_cond and deinit_act_rules from haproxy.c respectively in acl.c and action.c. The name of the functions has been slightly altered, replacing the prefix deinit_* by free_* to reflect their purpose more clearly. This change has been made in preparation to the implementation of a free proxy function. As a side-effect, it helps to clean up haproxy.c.	2021-03-26 15:28:33 +01:00
Amaury Denoyelle	ce44482fe5	REORG: global: move initcall register code in a dedicated file Create a new module init which contains code related to REGISTER_* macros for initcalls. init.h is included in api.h to make init code available to all modules. It's a step to clean up a bit haproxy.c/global.h.	2021-03-26 15:28:33 +01:00
Ilya Shipitsin	df627943a4	BUILD: ssl: introduce fine guard for ssl random extraction functions SSL_get_{client,server}_random are supported in OpenSSL-1.1.0, BoringSSL, LibreSSL-2.7.0 let us introduce HAVE_SSL_EXTRACT_RANDOM for that purpose	2021-03-26 15:19:07 +01:00
Remi Tricot-Le Breton	8218aed90e	BUG/MINOR: ssl: Fix update of default certificate The default SSL_CTX used by a specific frontend is the one of the first ckch instance created for this frontend. If this instance has SNIs, then the SSL context is linked to the instance through the list of SNIs contained in it. If the instance does not have any SNIs though, then the SSL_CTX is only referenced by the bind_conf structure and the instance itself has no link to it. When trying to update a certificate used by the default instance through a cli command, a new version of the default instance was rebuilt but the default SSL context referenced in the bind_conf structure would not be changed, resulting in a buggy behavior in which depending on the SNI used by the client, he could either use the new version of the updated certificate or the original one. This patch adds a reference to the default SSL context in the default ckch instances so that it can be hot swapped during a certificate update. This should fix GitHub issue #1143. It can be backported as far as 2.2.	2021-03-26 13:06:29 +01:00
Willy Tarreau	6cf13119e2	CLEANUP: fd: remove unused fd_set_running_excl() This one is no longer used and was the origin of the previously mentioned deadlock.	2021-03-24 17:17:21 +01:00
Willy Tarreau	2c3f9818e8	BUG/MEDIUM: fd: do not wait on FD removal in fd_delete() Christopher discovered an issue mostly affecting 2.2 and to a lesser extent 2.3 and above, which is that it's possible to deadlock a soft-stop when several threads are using a same listener. thread1 runs unbind_listener(), lock(listener), fd_delete(), then spins in while (running_mask); before it can reach unlock(listener); meanwhile thread2 runs fd_set_running(), listener_accept(), then lock(listener) -----> deadlock. This simple case disappeared from 2.3 due to the removal of some locked operations at the end of listener_accept() on the regular path, but the architectural problem is still here and caused by a lock inversion built around the loop on running_mask in fd_clr_running_excl(), because there are situations where the caller of fd_delete() may hold a lock that is preventing other threads from dropping their bit in running_mask. The real need here is to make sure the last user deletes the FD. We have all we need to know the last one, it's the one calling fd_clr_running() last, or entering fd_delete() last, both of which can be summed up as the last one calling fd_clr_running() if fd_delete() calls fd_clr_running() at the end. And we can prevent new threads from appearing in running_mask by removing their bits in thread_mask. So what this patch does is that it sets the running_mask for the thread in fd_delete(), clears the thread_mask, thus marking the FD as orphaned, then clears the running mask again, and completes the deletion if it was the last one. If it was not, another thread will pass through fd_clr_running and will complete the deletion of the FD. The bug is easily reproducible in 2.2 under high connection rates during soft close. When the old process stops its listener, occasionally two threads will deadlock and the old process will then be killed by the watchdog. It's strongly believed that similar situations do exist in 2.3 and 2.4 (e.g. if the removal attempt happens during resume_listener() called from listener_accept()) but if so, they should be much harder to trigger. This should be backported to 2.2 as the issue appeared with the FD migration. It requires previous patches "fd: make fd_clr_running() return the remaining running mask" and "MINOR: fd: remove the unneeded running bit from fd_insert()". Notes for backport: in 2.2, the fd_dodelete() function requires an extra argument "do_close" indicating whether we want to remove and close the FD (fd_delete) or just delete it (fd_remove). While this information is not conveyed along the chain, we know that late calls always imply do_close=1 because do_close=0 exclusively results from fd_remove() which is only used by the config parser and the master, both of which are single-threaded, hence are always the last ones in the running_mask. Thus it is safe to assume that a postponed FD deletion always implies do_close=1. Thanks to Olivier for his help in designing this optimal solution.	2021-03-24 17:17:21 +01:00
Willy Tarreau	71bada5ca4	MINOR: fd: remove the unneeded running bit from fd_insert() There's no point taking the running bit in fd_insert() since by definition there will never be more than one thread inserting the FD, and that fd_insert() may only be done after the fd was allocated by the system, indicating the end of use by any other thread. This will need to be backported to 2.2 to fix an issue.	2021-03-24 17:17:21 +01:00
Willy Tarreau	6e8e10b415	MINOR: fd: make fd_clr_running() return the remaining running mask We'll need to know that a thread is the last one to use an fd, so let's make fd_clr_running() return the remaining bits after removal. Note that in practice we're only interested in knowing if it's zero but the compiler doesn't make use of the flags after the AND and emits a CMPXCHG anyway :-/ This will need to be backported to 2.2 to fix an issue.	2021-03-24 17:17:21 +01:00
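A minimal sketch of the idea, assuming C11 atomics rather than HAProxy's own atomic macros: clearing a thread's bit with an atomic AND and returning what remains lets the caller know whether it was the last user. The `sketch_*` names and the single global mask are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for one FD's running_mask: one bit per thread. */
static _Atomic unsigned long sketch_running_mask;

/* Mark the calling thread (identified by its bit) as using the FD. */
static void sketch_set_running(unsigned long tid_bit)
{
	atomic_fetch_or(&sketch_running_mask, tid_bit);
}

/* Clear the calling thread's bit and return the bits that remain set.
 * atomic_fetch_and() returns the previous value, so the cleared bit is
 * masked out again to obtain the value after the operation.
 */
static unsigned long sketch_clr_running(unsigned long tid_bit)
{
	return atomic_fetch_and(&sketch_running_mask, ~tid_bit) & ~tid_bit;
}
```

A return value of zero means the caller was the last thread running on that FD.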
Christopher Faulet	cc2c4f8f4c	BUG/MEDIUM: debug/lua: Use internal hlua function to dump the lua traceback The commit reverts following commits: * `83926a04` BUG/MEDIUM: debug/lua: Don't dump the lua stack if not dumpable * `a61789a1` MEDIUM: lua: Use a per-thread counter to track some non-reentrant parts of lua Instead of relying on a Lua function to print the lua traceback into the debugger, we are now using our own internal function (hlua_traceback()). This one does not allocate memory and use a chunk instead. This avoids any issue with a possible deadlock in the memory allocator because the thread processing was interrupted during a memory allocation. This patch relies on the commit "BUG/MEDIUM: debug/lua: Use internal hlua function to dump the lua traceback". Both must be backported wherever the patches above are backported, thus as far as 2.0	2021-03-24 16:35:23 +01:00
Christopher Faulet	d09cc519bd	MINOR: lua: Slightly improve function dumping the lua traceback The separator string is now configurable, passing it as parameter when the function is called. In addition, the message has been slightly changed to be a bit more readable.	2021-03-24 16:33:26 +01:00
Ilya Shipitsin	8cd1627599	CLEANUP: ssl: remove unused definitions not need since `e7eb1fec2f`	2021-03-24 09:52:32 +01:00
Remi Tricot-Le Breton	fb00f31af4	BUG/MINOR: ssl: Prevent disk access when using "add ssl crt-list" If an unknown CA file was first mentioned in an "add ssl crt-list" CLI command, it would result in a call to X509_STORE_load_locations which performs a disk access which is forbidden during runtime. The same would happen if a "ca-verify-file" or "crl-file" was specified. This was due to the fact that the crt-list file parsing and the crt-list related CLI commands parsing use the same functions. The patch simply adds a new parameter to all the ssl_bind parsing functions so that they know if the call is made during init or by the CLI, and the ssl_store_load_locations function can then reject any new cafile_entry creation coming from a CLI call. It can be backported as far as 2.2.	2021-03-23 19:29:46 +01:00
Emeric Brun	69ba35146f	MINOR: tools: introduce new option PA_O_DEFAULT_DGRAM on str2sa_range. str2sa_range function options PA_O_DGRAM and PA_O_STREAM are used to define the supported address types but also to set the default type if it is not explicit. If the used address supports both STREAM and DGRAM, the default was always set to STREAM. This patch introduces a new option PA_O_DEFAULT_DGRAM to force the default to DGRAM type if it is not explicit in the address field and both STREAM and DGRAM are supported. If only DGRAM or only STREAM is supported, it continues to be considered as the default.	2021-03-23 15:32:22 +01:00
Willy Tarreau	8cc586c73f	BUG/MEDIUM: freq_ctr/threads: use the global_now_ms variable In commit `a1ecbca0a` ("BUG/MINOR: freq_ctr/threads: make use of the last updated global time"), for period-based counters, the millisecond part of the global_now variable was used as the date for the new period. But it's wrong, it only works with sub-second periods as it wraps every second, and for other periods the counters never rotate anymore. Let's make use of the newly introduced global_now_ms variable instead, which contains the global monotonic time expressed in milliseconds. This patch needs to be backported wherever the patch above is backported. It depends on previous commit "MINOR: time: also provide a global, monotonic global_now_ms timer".	2021-03-23 09:03:37 +01:00
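The failure mode described here can be illustrated with a small sketch. It assumes a wrapping-safe elapsed-time check on 32-bit millisecond timestamps; `sk_period_elapsed()` is a hypothetical helper, not a HAProxy function.

```c
#include <stdint.h>

/* Returns 1 if at least <period_ms> milliseconds separate <start_ms> from
 * <now_ms>. The unsigned subtraction keeps the comparison valid when a
 * monotonic 32-bit millisecond counter wraps around.
 */
static int sk_period_elapsed(uint32_t start_ms, uint32_t now_ms, uint32_t period_ms)
{
	return (uint32_t)(now_ms - start_ms) >= period_ms;
}
```

With a monotonic millisecond clock, a 10-second period that started at 500 ms is detected as elapsed at 10500 ms. If only the millisecond part of the current second is used (`500 % 1000` and `10500 % 1000`, both equal to 500), the difference stays below 1000 forever and such a period never appears to elapse, which matches the "counters never rotate" symptom.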
Willy Tarreau	6064b34be0	MINOR: time: also provide a global, monotonic global_now_ms timer The period-based freq counters need the global date in milliseconds, so better calculate it and expose it rather than letting all call places incorrectly retrieve it. Here what we do is that we maintain a new globally monotonic timer, global_now_ms, which ought to be very close to the global_now one, but maintains the monotonic approach of now_ms between all threads in that global_now_ms is always ahead of any now_ms. This patch is made simple to ease backporting (it will be needed for a subsequent fix), but it also opens the way to some simplifications on the time handling: instead of computing the local time and trying to force it to the global one, we should soon be able to proceed in the opposite way, that is computing the new global time and making the local one just the latest snapshot of it. This will bring the benefit of making sure that the global time is always ahead of the local one.	2021-03-23 09:01:37 +01:00
Willy Tarreau	5d110b25dd	CLEANUP: connection: use pool_zalloc() in conn_alloc_hash_node() This one used to alloc then zero the area, let's have the allocator do it.	2021-03-22 23:17:24 +01:00
Willy Tarreau	18759079b6	MINOR: pools: add pool_zalloc() to return a zeroed area It's like pool_alloc() but the output is zeroed before being returned and is never poisoned.	2021-03-22 22:05:05 +01:00
Willy Tarreau	de749a9333	MINOR: pools: make the pool allocator support a few flags The pool_alloc_dirty() function was renamed to __pool_alloc() and now takes a set of flags indicating whether poisoning is permitted or not and whether zeroing the area is needed or not. The pool_alloc() function is now just a wrapper calling __pool_alloc(pool, 0).	2021-03-22 20:54:15 +01:00
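The flag-based allocator and its two wrappers could look roughly like the sketch below. It is an illustration only: it uses `malloc()` instead of a real pool, always poisons when permitted (poisoning is an optional debugging feature in practice), and the `SK_*` flag names, the poison byte and the function names are assumptions rather than HAProxy's definitions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values for this sketch. */
#define SK_POOL_F_NO_POISON 0x1  /* never poison the returned area */
#define SK_POOL_F_MUST_ZERO 0x2  /* zero the returned area */
#define SK_POISON_BYTE      0xA5 /* arbitrary poison pattern */

/* Core allocator: <flags> decides between zeroing, poisoning or neither. */
static void *sk_pool_alloc_flags(size_t size, unsigned int flags)
{
	void *p = malloc(size);

	if (!p)
		return NULL;
	if (flags & SK_POOL_F_MUST_ZERO)
		memset(p, 0, size);              /* zeroed areas are never poisoned */
	else if (!(flags & SK_POOL_F_NO_POISON))
		memset(p, SK_POISON_BYTE, size); /* helps catch use of uninitialized data */
	return p;
}

/* Regular allocation: no special flags. */
static void *sk_pool_alloc(size_t size)
{
	return sk_pool_alloc_flags(size, 0);
}

/* Zeroed allocation, in the spirit of pool_zalloc(). */
static void *sk_pool_zalloc(size_t size)
{
	return sk_pool_alloc_flags(size, SK_POOL_F_MUST_ZERO);
}
```

Callers that used to allocate and then clear the area by hand can call the zeroing wrapper instead.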
Willy Tarreau	a213b683f7	CLEANUP: pools: remove the unused pool_get_first() function This one used to maintain a shortcut in the pools allocation path that was only justified by b_alloc_fast() which was not used! Let's get rid of it as well so that the allocator becomes a bit more straight forward.	2021-03-22 16:28:08 +01:00
Willy Tarreau	7be7ffac15	CLEANUP: dynbuf: remove the unused b_alloc_fast() function It is never used anymore since 1.7 where it was used by b_alloc_margin() then replaced by direct calls to the pools function, and it maintains a dependency on the exposed pools functions. It's time to get rid of it, as it's not even certain it still works.	2021-03-22 16:28:05 +01:00
Willy Tarreau	f44ca97fcb	CLEANUP: dynbuf: remove b_alloc_margin() It's not used anymore, let's completely remove it before anyone uses it again by accident.	2021-03-22 16:28:02 +01:00
Willy Tarreau	0f495b3d87	MINOR: channel: simplify the channel's buffer allocation The channel's buffer allocator, channel_alloc_buffer(), was still relying on the principle of a margin for the request and not for the response. But this margin stopped working around 1.7 with the introduction of the content filters such as SPOE, and was completely annihilated with the local pools that came with threads. Let's simplify this and just use b_alloc().	2021-03-22 16:19:45 +01:00
Willy Tarreau	766b6cf206	MINOR: dynbuf: make b_alloc() always check if the buffer is allocated Right now there is a discrepancy between b_alloc() and b_alloc_margin(): the former forcefully overwrites the target pointer while the latter tests it and returns it as-is if already allocated. As a matter of fact, all callers of b_alloc() either test the buffer beforehand, or assume it's already null. Let's remove this pain and make the function test the buffer's allocation before doing it again, and match call places' expectations.	2021-03-22 16:14:45 +01:00
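The behaviour change can be sketched as an allocation function that returns the buffer as-is when it is already allocated. The `sk_buffer` structure and `sk_b_alloc()` below are hypothetical simplifications (a plain `malloc()` and an explicit non-zero size argument), not the real dynbuf API.

```c
#include <stdlib.h>

/* Hypothetical minimal buffer descriptor; size == 0 means "not allocated". */
struct sk_buffer {
	char *area;
	size_t size;
};

/* Allocate <size> bytes (assumed non-zero) for <buf> unless it is already
 * allocated, in which case it is returned untouched. Returns NULL on
 * allocation failure.
 */
static struct sk_buffer *sk_b_alloc(struct sk_buffer *buf, size_t size)
{
	char *area;

	if (buf->size)
		return buf; /* already allocated: do not overwrite the pointer */

	area = malloc(size);
	if (!area)
		return NULL;

	buf->area = area;
	buf->size = size;
	return buf;
}
```

Calling the function twice on the same descriptor then neither leaks nor replaces the first area, so callers no longer need their own preliminary test.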
Christopher Faulet	a61789a1d6	MEDIUM: lua: Use a per-thread counter to track some non-reentrant parts of lua Some parts of Lua are non-reentrant. We must be sure to carefully track these parts to not dump the lua stack when it is interrupted inside such parts. For now, we only identified the custom lua allocator. If the thread is interrupted during the memory allocation, we must not try to print the lua stack which also allocates memory. Indeed, realloc() is not async-signal-safe. In this patch we introduce a thread-local counter. It is incremented before entering in a non-reentrant part and decremented when exiting. It is only performed in hlua_alloc() for now.	2021-03-19 16:16:23 +01:00
Olivier Houchard	dae6975498	MINOR: muxes: garbage collect the reset() method. Now that connections aren't being reused when they failed, remove the reset() method. It was unimplemented anywhere, except for H1 where it did nothing, anyway.	2021-03-19 15:33:04 +01:00
Olivier Houchard	1b3c931bff	MEDIUM: connections: Introduce a new XPRT method, start(). Introduce a new XPRT method, start(). The init() method will now only initialize whatever is needed for the XPRT to run, but any action the XPRT has to do before being ready, such as handshakes, will be done in the new start() method. That way, we will be sure the full stack of xprt will be initialized before attempting to do anything. The init() call is also moved to conn_prepare(). There's no longer any reason to wait for the ctrl to be ready, any action will be deferred until start(), anyway. This means conn_xprt_init() is no longer needed.	2021-03-19 15:33:04 +01:00
Amaury Denoyelle	216a1ce3b9	MINOR: stats: export function to allocate extra proxy counters Remove static qualifier on stats_allocate_proxy_counters_internal. This function will be used to allocate extra counters at runtime for dynamic servers.	2021-03-18 15:52:07 +01:00
Amaury Denoyelle	76e10e78bb	MINOR: server: prepare parsing for dynamic servers Prepare the server parsing API to support dynamic servers. - define a new parsing flag to be used for dynamic servers - each keyword contains a new field dynamic_ok to indicate if it can be used for a dynamic server. For now, no keywords are supported. - do not copy settings from the default server for a new dynamic server. - a dynamic server is created in a maintenance mode and requires an explicit 'enable server' command. - a new server flag named SRV_F_DYNAMIC is created. This flag is set for all servers created at runtime. It might be useful later, for example to know if a server can be purged.	2021-03-18 15:51:12 +01:00
Amaury Denoyelle	30c0537f5a	REORG: server: use flags for parse_server Modify the API of parse_server function. Use flags to describe the type of the parsed server instead of discrete arguments. These flags can be used to specify if a server/default-server/server-template is parsed. Additional parameters are also specified (parsing of the address required, resolve of a name must be done immediately). It is now unneeded to use strcmp on args[0] in parse_server. Also, the calls to parse_server are more explicit thanks to the flags.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	828adf0121	REORG: server: add a free server function Create a new server function named free_server. It can be used to deallocate a server and its members.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	18487fb532	MINOR: cli: implement experimental-mode Experimental mode is similar to expert-mode. It can be used to access features still in development.	2021-03-18 15:37:05 +01:00
Willy Tarreau	6f9f2c0857	MINOR: freq_ctr/threads: relax when failing to update a sliding window value The swrate_add* functions would spin fast on a failed CAS, better place a cpu_relax() call there to reduce contention if any.	2021-03-17 19:36:15 +01:00
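The pattern of relaxing the CPU after a failed compare-and-swap can be sketched with C11 atomics. The sliding-window update formula used here (`sum - ceil(sum / n) + v`) is an assumption made for illustration, and `sk_cpu_relax()` and `sk_swrate_add()` are hypothetical names, not HAProxy's definitions.

```c
#include <stdatomic.h>

/* Hypothetical relax primitive: a "pause" hint on x86, a no-op elsewhere. */
#if defined(__x86_64__) || defined(__i386__)
#define sk_cpu_relax() __asm__ volatile("pause" ::: "memory")
#else
#define sk_cpu_relax() do { } while (0)
#endif

/* Add value <v> to the sliding sum <sum> computed over <n> samples and
 * return the new sum. On a failed CAS, <old> is refreshed with the current
 * value and the CPU is relaxed before retrying, which reduces contention
 * on the shared cache line.
 */
static unsigned int sk_swrate_add(_Atomic unsigned int *sum, unsigned int n, unsigned int v)
{
	unsigned int old = atomic_load(sum);
	unsigned int new_sum;

	while (1) {
		new_sum = old - (old + n - 1) / n + v;
		if (atomic_compare_exchange_weak(sum, &old, new_sum))
			break;
		sk_cpu_relax();
	}
	return new_sum;
}
```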
Willy Tarreau	a1ecbca0a5	BUG/MINOR: freq_ctr/threads: make use of the last updated global time The freq counters were using the thread's own time as the start of the current period. The problem is that in case of contention, it was occasionally possible to perform non-monotonic updates on the edge of the next second, because if the upfront thread updates a counter first, it causes a rotation, then the second thread loses the race from its older time, and tries again, and detects a different time again, but in the past so it only updates the counter, then a third thread on the new date would detect a change again, thus provoking a rotation again. The effect was triple: - rare loss of stored values during certain transitions from one period to the next one, causing counters to report 0 - half of the threads forced to go through the slow path every second - difficult convergence when using many threads where the CAS can fail a lot and we can observe N(N-1) attempts for N threads to complete This patch fixes this issue in two ways: - first, it now makes use of the monotonic global_now value which also happens to be volatile and to carry the latest known time; this way time will never jump backwards anymore and only the first thread updates it on transition, the other ones do not need to. - second, re-read the time in the loop after each failure, because if the date changed in the counter, it means that one thread knows a more recent one and we need to update. In this case if it matches the new current second, the fast path is usable. This patch relies on previous patch "MINOR: time: export the global_now variable" and must be backported as far as 1.8.	2021-03-17 19:36:15 +01:00
Willy Tarreau	650f374f24	MINOR: time: export the global_now variable This is the process-wide monotonic time that is used to update each thread's own time. It may be required at a few places where a strictly monotonic clock is required such as freq_ctr. It will have to be backported as a dependency of a forthcoming fix.	2021-03-17 19:25:47 +01:00
Willy Tarreau	31a3cea84f	MINOR: cfgparse/proxy: also support spelling fixes on options Some are not always easy to spot with "chk" vs "check" or hyphens at some places and not at others. Now entering "option http-close" properly suggests "httpclose" and "option tcp-chk" suggests "tcp-check". There's no need to consider the proxy's capabilities, what matters is to figure what related word the user tried to spell, and there are not that many options anyway.	2021-03-15 11:14:57 +01:00
Willy Tarreau	b12bc646d5	MINOR: cli: limit spelling suggestions to 5 There's no need to suggest up to 10 entries for matching keywords, most of the times 5 are plenty, and will be more readable.	2021-03-15 10:40:13 +01:00
Willy Tarreau	9294e8822f	MINOR: tools: improve word fingerprinting by counting presence The distance between two words can be high due to a sub-word being missing and in this case it happens that other totally unrelated words are proposed because their average score looks lower thanks to being shorter. Here we're introducing the notion of presence of each character so that word sequences that contain existing sub-words are favored against the shorter ones having nothing in common. In addition we do not distinguish begin/end from a regular delimiter anymore. That made it harder to spot inverted words.	2021-03-15 09:38:42 +01:00
Ilya Shipitsin	f3ede874a5	CLEANUP: assorted typo fixes in the code and comments This is the 20th iteration of typo fixes	2021-03-13 11:45:17 +01:00
Willy Tarreau	7416314145	CLEANUP: task: make sure tasklet handlers always indicate their statuses When tasklets were derived from tasks, there was no immediate need for the scheduler to know their status after execution, and in a spirit of simplicity they just started to always return NULL. The problem is that it simply prevents the scheduler from 1) accounting their execution time, and 2) keeping track of their current execution status. Indeed, a remote wake-up could very well end up manipulating a tasklet that's currently being executed. And this is the reason why those handlers have to take the idle lock before checking their context. In 2.5 we'll take care of making tasklets and tasks work more similarly, but trouble is to be expected if we continue to propagate the trend of returning NULL everywhere, especially if some fixes relying on a stricter model later need to be backported. For this reason this patch updates all known tasklet handlers to make them return NULL only when the tasklet was freed. It has no effect for now and isn't even guaranteed to always be 100% safe but it puts the code into the right direction for this.	2021-03-13 11:30:19 +01:00
Willy Tarreau	4975d1482f	CLEANUP: cli: rename the last few "stats_" to "cli_" There were still a very small list of functions, variables and fields called "stats_" while they were really purely CLI-centric. There's the frontend called "stats_fe" in the global section, which instantiates a "cli_applet" called "<CLI>" so it was renamed "cli_fe". The "alloc_stats_fe" function was renamed to "cli_alloc_fe" which also better matches the naming convention of all cli-specific functions. Finally the "stats_permission_denied_msg" used to return an error on the CLI was renamed "cli_permission_denied_msg". Now there's no more "stats_something" that designates the CLI.	2021-03-13 11:04:35 +01:00
Willy Tarreau	f14c7570d6	CLEANUP: cli: rename MAX_STATS_ARGS to MAX_CLI_ARGS This is the number of args accepted on a command received on the CLI, it has long been totally independent of stats and should not carry this misleading "stats" name anymore.	2021-03-13 10:59:23 +01:00
Willy Tarreau	e33c4b3c11	MINOR: tools: add the ability to update a word fingerprint Instead of making a new one from scratch, let's support not wiping the existing fingerprint and updating it, and to do the same char by char. The word-by-word one will still result in multiple beginnings and ends, but that will accurately translate word boundaries. The char-based one has more flexibility and requires that the caller maintains the previous char to indicate the transition, which also allows to insert delimiters for example.	2021-03-12 19:09:19 +01:00
Willy Tarreau	b736458bfa	MEDIUM: cli: apply spelling fixes for known commands before listing them Entering "show tls" would still emit 35 entries. By measuring the distance between all unknown words and the candidates, we can sort them and pick the 10 most likely candidates. This works reasonably well, as now "show tls" only proposes "show tls-keys", "show threads", "show pools" and "show tasks". If the distance is still too high or if a word is missing, the whole prefix list continues to be dumped, thus "show" alone will still report the entire list of commands beginning with "show". It's still impossible to skip a word, for example "show conn" will not propose "show servers conn" because the distance is calculated for each word individually. Some changes to the distance calculation to support updating an existing map could easily address this. But this is already a great improvement.	2021-03-12 19:09:19 +01:00
Willy Tarreau	4451150251	CLEANUP: cli: fix misleading comment and better indent the access level flags It was mentioned that ACCESS_MASTER_ONLY was for workers only instead of master-only. And it wasn't clear that all ACCESS_* would belong to the same thing.	2021-03-12 19:09:19 +01:00
Christopher Faulet	55c1c4053f	MINOR: resolvers: Use milliseconds for cached items in resolver responses The last time when an item was seen in a resolver response is now stored in milliseconds instead of seconds. This avoids some corner-cases at the edges. This also simplifies time comparisons.	2021-03-12 17:41:28 +01:00
Christopher Faulet	0efc0993ec	BUG/MEDIUM: resolvers: Don't release resolution from a requester callbacks Another way to say it: "Safely unlink requester from a requester callbacks". Requester callbacks must never try to unlink a requester from a resolution, for the current requester or another one. First, these callback functions are called in a loop on a request list, not necessarily safe. Thus unlinking a resolution at this place may be unsafe. And it is useless to try to make these loops safe because all this stuff is placed in a loop on a resolution list. Unlinking a requester may lead to releasing a resolution if it is the last requester. However, the unlink is necessary because we cannot reset the server state (hostname and IP) with some pending DNS resolution on it. So, to work around this issue, we introduce the "safe" unlink. It is only performed from a requester callback. In this case, the unlink function never releases the resolution, it only resets it if necessary. And when a resolution is found with an empty requester list, it is released. This patch depends on the following commits : * MINOR: resolvers: Purge answer items when a SRV resolution triggers an error * MINOR: resolvers: Use a function to remove answers attached to a resolution * MINOR: resolvers: Directly call srvrq_update_srv_state() when possible * MINOR: resolvers: Add function to change the srv status based on SRV resolution All the series must be backported as far as 2.2. It fixes a regression introduced by the commit `b4badf720` ("BUG/MINOR: resolvers: new callback to properly handle SRV record errors").	2021-03-12 17:41:28 +01:00
Christopher Faulet	5efdef24c1	MINOR: resolvers: Add function to change the srv status based on SRV resolution srvrq_update_srv_status() updates the server status based on the result of SRV resolution. For now, it is only used from snr_update_srv_status() when appropriate.	2021-03-12 17:41:28 +01:00
Christopher Faulet	1dec5c7934	MINOR: resolvers: Use a function to remove answers attached to a resolution resolv_purge_resolution_answer_records() must be used to remove all answers attached to a resolution. For now, it is only used when a resolution is released.	2021-03-12 17:41:28 +01:00
Baptiste Assmann	6a8d11dc80	MINOR: resolvers: new function find_srvrq_answer_record() This function searches for a SRV answer item associated to a requester whose type is server. This is mainly useful to "link" a server to its SRV record when no additional records were found to configure the IP address. This patch is required by a bug fix.	2021-03-12 17:41:28 +01:00
Willy Tarreau	99eb2cc1cc	MINOR: actions: add a function to suggest an action resembling a given word action_suggest() will return a pointer to an action whose keyword more or less resembles the passed argument. It also accepts to be more tolerant against prefixes (since actions taking arguments are handled as prefixes). This will be used to suggest approaching words.	2021-03-12 14:13:21 +01:00
Willy Tarreau	433b05fa64	MINOR: cfgparse/bind: suggest correct spelling for unknown bind keywords Just like with the server keywords, now's the turn of "bind" keywords. The difference is that 100% of the bind keywords are registered, thus we do not need the list of extra keywords. There are multiple bind line parsers today, all were updated: - peers - log - dgram-bind - cli $ printf "listen f\nbind :8000 tcut\n" | ./haproxy -c -f /dev/stdin [NOTICE] 070/101358 (25146) : haproxy version is 2.4-dev11-7b8787-26 [NOTICE] 070/101358 (25146) : path to executable is ./haproxy [ALERT] 070/101358 (25146) : parsing [/dev/stdin:2] : 'bind :8000' unknown keyword 'tcut'; did you mean 'tcp-ut' maybe ? [ALERT] 070/101358 (25146) : Error(s) found in configuration file : /dev/stdin [ALERT] 070/101358 (25146) : Fatal errors found in configuration.	2021-03-12 14:13:21 +01:00
Willy Tarreau	e2afcc4509	MINOR: cfgparse: add cfg_find_best_match() to suggest an existing word Instead of just reporting "unknown keyword", let's provide a function which will look through a list of registered keywords for a similar-looking word to the one that wasn't matched. This will help callers suggest correct spelling. Also, given that a large part of the config parser still relies on a long chain of strcmp(), we'll need to be able to pass extra candidates. Thus the function supports an optional extra list for this purpose.	2021-03-12 14:13:21 +01:00
Willy Tarreau	ba2c4459a5	MINOR: tools: add simple word fingerprinting to find similar-looking words This introduces two functions, one which creates a fingerprint of a word, and one which computes a distance between two words fingerprints. The fingerprint is made by counting the transitions between one character and another one. Here we consider the 26 alphabetic letters regardless of their case, then any digit as a digit, and anything else as "other". We also consider the first and last locations as transitions from begin to first char, and last char to end. The distance is simply the sum of the squares of the differences between two fingerprints. This way, doubling/ missing a letter has the same cost, however some repeated transitions such as "e"->"r" like in "server" are very unlikely to match against situations where they do not exist. This is a naive approach but it seems to work sufficiently well for now. It may be refined in the future if needed.	2021-03-12 14:13:21 +01:00
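The fingerprinting scheme described in this commit can be sketched as follows: count the transitions between character classes (26 case-insensitive letters, "digit", "other", plus a begin/end marker), then sum the squared differences between two fingerprints. The class numbering, array size and `sk_*` names are assumptions made for this sketch, not HAProxy's actual layout.

```c
#include <stdint.h>
#include <string.h>

#define SK_CLASS_DIGIT 26 /* any digit */
#define SK_CLASS_OTHER 27 /* anything else */
#define SK_CLASS_EDGE  28 /* begin or end of word */
#define SK_NCLASS      29
#define SK_FP_SIZE     (SK_NCLASS * SK_NCLASS)

/* Map a character to its class: letters regardless of case, digit, other. */
static int sk_class(unsigned char c)
{
	if (c >= 'a' && c <= 'z')
		return c - 'a';
	if (c >= 'A' && c <= 'Z')
		return c - 'A';
	if (c >= '0' && c <= '9')
		return SK_CLASS_DIGIT;
	return SK_CLASS_OTHER;
}

/* Fill <fp> (SK_FP_SIZE bytes) with the number of occurrences of each
 * transition found in <word>, including begin->first and last->end.
 * Counters are 8-bit, which is enough for short keywords.
 */
static void sk_make_fp(uint8_t *fp, const char *word)
{
	int prev = SK_CLASS_EDGE;
	int cur;

	memset(fp, 0, SK_FP_SIZE);
	for (; *word; word++) {
		cur = sk_class((unsigned char)*word);
		fp[prev * SK_NCLASS + cur]++;
		prev = cur;
	}
	fp[prev * SK_NCLASS + SK_CLASS_EDGE]++;
}

/* Distance = sum of the squares of the differences between fingerprints. */
static int sk_fp_distance(const uint8_t *a, const uint8_t *b)
{
	int i, d, dist = 0;

	for (i = 0; i < SK_FP_SIZE; i++) {
		d = (int)a[i] - (int)b[i];
		dist += d * d;
	}
	return dist;
}
```

With this sketch, the misspelled "tcut" ends up much closer to "tcp-ut" than to an unrelated keyword such as "accept-proxy", which is the property a spelling suggestion needs.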
Willy Tarreau	133c8c412e	CLEANUP: actions: the keyword must always be const from the rule There's no reason for a rule to want to modify an action keyword, let's make sure it is always const.	2021-03-12 14:13:21 +01:00
Christopher Faulet	77e376783e	BUG/MINOR: proxy/session: Be sure to have a listener to increment its counters It is possible to have a session without a listener. It happens for applets on the client side. Thus all accesses to the listener info from the session must be guarded. It was the purpose of the commit `36119de18` ("BUG/MEDIUM: session: NULL dereference possible when accessing the listener"). However, some tests on the session's listener existence are missing in proxy_inc_* functions. This patch should fix the issues #1171, #1172, #1173, #1174 and #1175. It must be backported with the above commit as far as 1.8.	2021-03-12 09:25:45 +01:00
Willy Tarreau	3b728a92bb	BUILD: atomic/arm64: force the register pairs to use in __ha_cas_dw() Since commit `f8fb4f75f` ("MINOR: atomic: implement a more efficient arm64 __ha_cas_dw() using pairs"), on some modern arm64 (armv8.1+) compiled with -march=armv8.1-a under gcc-7.5.0, a build error may appear on ev_poll.o : /tmp/ccHD2lN8.s:1771: Error: reg pair must start from even reg at operand 1 -- `casp x27,x28,x22,x23,[x12]' Makefile:927: recipe for target 'src/ev_poll.o' failed It appears that the compiler cannot always assign register pairs there for a structure made of two u64. It was possibly later addressed since gcc-9.3 never caused this, but there's no trivially available info on the subject in the changelogs. Unsurprisingly, using a u128 instead does fix this, but it significantly inflates the code (+4kB for just 6 places, very likely that it loaded some extra stubs) and the comparison is ugly, involving two slower conditional jumps instead of a single one and a conditional comparison. For example, ha_random64() grew from 144 bytes to 232. However, simply forcing the base register does work pretty well, and makes the code even cleaner and more efficient by further reducing it by about 4.5kB, possibly because it helps the compiler to pick suitable registers for the pair there. And the perf on 64-cores looks steadily 0.5% above the previous one, so let's do this. Note that the commit above was backported to 2.3 to fix scalability issues on AWS Graviton2 platform, so this one will need to be as well.	2021-03-12 06:26:22 +01:00
Frédéric Lécaille	c0ed91910a	BUG/MINOR: connection: Missing QUIC initialization The QUIC connection struct connection member was not initialized. This may randomly make haproxy handle TLS connections as QUIC ones, only when QUIC support is enabled, leading to such OpenSSL errors (captured from a reg test output, TLS Client-Hello callback failed): OpenSSL error[0x10000085] OPENSSL_internal: CONNECTION_REJECTED OpenSSL error[0x10000410] OPENSSL_internal: SSLV3_ALERT_HANDSHAKE_FAILURE OpenSSL error[0x1000009a] OPENSSL_internal: HANDSHAKE_FAILURE_ON_CLIENT_HELLO This patch should fix #1168 github issue.	2021-03-10 12:21:05 +01:00
Willy Tarreau	060a761248	OPTIM: task: automatically adjust the default runqueue-depth to the threads The recent default runqueue size reduction appeared to have significantly lowered performance on low-thread count configs. Testing various runqueue values on different workloads under thread counts ranging from 1 to 64, it appeared that lower values are more optimal for high thread counts and conversely. It could even be drawn that the optimal value for various workloads sits around 280/sqrt(nbthread), and probably has to do with both the L3 cache usage and how to optimally interlace the threads' activity to minimize contention. This is much easier to optimally configure, so let's do this by default now.	2021-03-10 11:15:34 +01:00
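The 280/sqrt(nbthread) rule of thumb can be evaluated with integer arithmetic only, as in the sketch below. The function name and the rounding (towards zero) are assumptions; the commit message does not state how the actual default is rounded or computed.

```c
/* Return floor(280 / sqrt(nbthread)) without floating point: this is the
 * largest depth d such that d * d * nbthread <= 280 * 280.
 */
static int sk_default_runqueue_depth(int nbthread)
{
	int d = 280;

	if (nbthread < 1)
		nbthread = 1;
	while (d > 1 && (long)d * d * nbthread > 280L * 280L)
		d--;
	return d;
}
```

This yields 280 for a single thread, 140 for 4 threads, 70 for 16 threads and 35 for 64 threads, in line with the observation that lower values suit high thread counts.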
Daniel Corbett	befef70e23	BUG/MINOR: sample: Rename SenderComID/TargetComID to SenderCompID/TargetCompID The recently introduced Financial Information eXchange (FIX) converters have some hard coded tags based on the specification that were misspelled. Specifically, SenderComID and TargetComID should be SenderCompID and TargetCompID according to the specification [1][2]. This patch updates all references, which includes the converters themselves, the regression test, and the documentation. [1] https://fiximate.fixtrading.org/en/FIX.5.0SP2_EP264/tag49.html [2] https://fiximate.fixtrading.org/en/FIX.5.0SP2_EP264/tag56.html	2021-03-10 10:44:20 +01:00
Emeric Brun	4c75195f5b	BUG/MEDIUM: resolvers: handle huge responses over tcp servers. Parameter "accepted_payload_size" is currently considered regardless of whether the nameserver in use relies on TCP or UDP. It remains mandatory to announce such a capability to support e-dns, so a value has to be announced in TCP as well. The maximum DNS message size in TCP is limited by the protocol to 65535, and the same applies to UDP (65507) if the system supports such UDP messages. But the maximum value for this option was arbitrarily forced to 8192. This patch changes this maximum to 65535 to allow users to set a bigger value for UDP if their system supports it. It also sets accepted_payload_size in TCP, allowing huge responses to be retrieved if the configuration uses TCP nameservers. The request announcing the accepted_payload_size capability is currently built at the resolvers level and is common to all nameservers of the section regardless of the transport protocol used. A further patch should be made to at least specify a different payload size depending on the transport; it could perhaps be forced to 65535 in case of TCP, with the maximum forced back to 65507 to match the UDP max. This patch is applicable since version 2.4	2021-03-09 15:44:46 +01:00
Willy Tarreau	e89fae3a4e	CLEANUP: stream: rename a few remaining occurrences of "stream *sess" These are some leftovers from the ancient code where they were still called sessions, but these areas in the code remain confusing due to this naming. They are now called "strm" which will not even affect indenting nor alignment.	2021-03-09 15:44:33 +01:00
Willy Tarreau	c93638e1d1	BUILD: connection: do not use VAR_ARRAY in struct tlv It was brought by commit `c44b8de99` ("CLEANUP: connection: Use `VAR_ARRAY` in `struct tlv` definition") but breaks the build with clang. Actually it had already been done 6 months ago by commit `4987a4744` ("CLEANUP: tree-wide: use VAR_ARRAY instead of [0] in various definitions") then reverted by commit `441b6c31e` ("BUILD: connection: fix build on clang after the VAR_ARRAY cleanup") which explained the same thing but didn't place a comment in the code to justify this (in short it's just an end of struct marker).	2021-03-09 10:15:16 +01:00
Willy Tarreau	018251667e	CLEANUP: config: make the cfg_keyword parsers take a const for the defproxy The default proxy was passed as a variable to all parsers instead of a const, which is not without risk, especially when some timeout parsers used to make some int pointers point to the default values for comparisons. We want to be certain that none of these parsers will modify the defaults sections by accident, so it's important to mark this proxy as const. This patch touches all occurrences found (89).	2021-03-09 10:09:43 +01:00
Willy Tarreau	82a92743fc	BUILD: bug: refine HA_LINK_ERROR() to only be used on gcc and derivatives TCC happens to define __OPTIMIZE__ at -O2 but doesn't proceed with dead code elimination, resulting in ha_free() to always reference the link error symbol. Let's condition this test on __GCC__ which others like Clang also define.	2021-03-09 10:09:43 +01:00
Tim Duesterhus	615f81eb5a	MINOR: connection: Use a `struct ist` to store proxy_authority This makes the code cleaner, because proxy_authority can be handled like proxy_unique_id.	2021-03-09 09:24:32 +01:00
Tim Duesterhus	002bd77a6e	CLEANUP: connection: Use istptr / istlen for proxy_unique_id Don't access the ist's fields directly, use the helper functions instead.	2021-03-09 09:24:32 +01:00
Tim Duesterhus	e004c2beae	CLEANUP: connection: Remove useless test for NULL before calling `pool_free()` `pool_free()` is a noop when the given pointer is NULL. No need to test.	2021-03-09 09:24:32 +01:00
Tim Duesterhus	c44b8de995	CLEANUP: connection: Use `VAR_ARRAY` in `struct tlv` definition This is for consistency with `struct tlv_ssl`.	2021-03-09 09:24:32 +01:00
Olivier Houchard	7b00e31509	BUILD: Fix build when using clang without optimizing. ha_free() uses code that attempts to set a non-existent variable to provoke a link-time error, with the expectation that the compiler will not emit it if the code is unreachable. However, clang will emit it when compiling with no optimization, so only do that if __OPTIMIZE__ is defined.	2021-03-05 16:58:56 +01:00
Willy Tarreau	eef7f7fe68	CLEANUP: server: reorder some fields in the server struct to respect cache lines There's currently quite some thread contention in the server struct because frequently accessed fields are mixed with those being often written to by any thread. Let's split this a little bit to separate a few areas: - pure config / admin / operating status (almost never changes) - idle and queuing (fast changes, done almost together) - LB (fast changes, not necessarily dependent on the above) - counters (fast changes, at a different instant again)	2021-03-05 15:00:24 +01:00
Willy Tarreau	d4e78d873c	MINOR: server: move actconns to the per-thread structure The actconns list creates massive contention on low server counts because it's in fact a list of streams using a server, all threads compete on the list's head and it's still possible to see some watchdog panics on 48 threads under extreme contention with 47 threads trying to add and one thread trying to delete. Moving this list per thread is trivial because it's only used by srv_shutdown_streams(), which simply required to iterate over the list. The field was renamed to "streams" as it's really a list of streams rather than a list of connections.	2021-03-05 15:00:24 +01:00
Willy Tarreau	430bf4a483	MINOR: server: allocate a per-thread struct for the per-thread connections stuff There are multiple per-thread lists in the listeners, which isn't the most efficient in terms of cache, and doesn't easily allow to store all the per-thread stuff. Now we introduce an srv_per_thread structure which the servers will have an array of, and place the idle/safe/avail conns tree heads into. Overall this was a fairly mechanical change, and the array is now always initialized for all servers since we'll put more stuff there. It's worth noting that the Lua code still has to deal with its own deinit by itself despite being in a global list, because its server is not dynamically allocated.	2021-03-05 15:00:24 +01:00
Willy Tarreau	198e92a8e5	MINOR: server: add a global list of all known servers It's a real pain not to have access to the list of all registered servers, because whenever there is a need to late adjust their configuration, only those attached to regular proxies are seen, but not the peers, lua, logs nor DNS. What this patch does is that new_server() will automatically add the newly created server to a global list, and it does so as well for the 1 or 2 statically allocated servers created for Lua. This way it will be possible to iterate over all of them.	2021-03-05 15:00:24 +01:00
Willy Tarreau	90e9b8c8b6	CLEANUP: global: reorder some fields to respect cache lines Some entries are atomically updated by various threads, such as the global counters, and they're mixed with others which are read all the time like the mode. This explains why "perf" was seeing a huge access cost on global.mode in process_stream()! Let's reorder them so that the static config stuff is at the beginning and the live stuff is at the end.	2021-03-05 08:30:08 +01:00
Willy Tarreau	cc2672f48b	MINOR: server: don't read curr_used_conns multiple times This one is added atomically and we reread it just after this, causing a second memory load that is visible in the perf profile.	2021-03-05 08:30:08 +01:00
Willy Tarreau	4f8cd4397f	MINOR: xprt: add new xprt_set_idle and xprt_set_used methods These functions are used on the mux layer to indicate that the connection is becoming idle and that the xprt ought to be careful before checking the context or that it's not idle anymore and that the context is safe. The purpose is to allow a mux which is going to release a connection to tell the xprt to be careful when touching it. At the moment, the xprt are always careful and that's costly so we want to have the ability to relax this a bit. No xprt layer uses this yet.	2021-03-05 08:30:08 +01:00
Willy Tarreau	6fa8bcdc78	MINOR: task: add an application specific flag to the state: TASK_F_USR1 This flag will be usable by any application. It will be preserved across wakeups so the application can use it to do various stuff. Some I/O handlers will soon benefit from this.	2021-03-05 08:30:08 +01:00
Willy Tarreau	144f84a09d	MEDIUM: task: extend the state field to 32 bits It's been too short for quite a while now and is now full. It's still time to extend it to 32-bits since we have room for this without wasting any space, so we now gained 16 new bits for future flags. The values were not reassigned just in case there would be a few hidden u16 or short somewhere in which these flags are placed (as it used to be the case with stream->pending_events). The patch is tagged MEDIUM because this required to update the task's process() prototype to use an int instead of a short, that's quite a bunch of places.	2021-03-05 08:30:08 +01:00
Willy Tarreau	e0d5942ddd	MINOR: task: move the nice field to the struct task only The nice field isn't needed anymore for the tasklet so we can move it from the TASK_COMMON area into the struct task which already has a hole around the expire entry.	2021-03-05 08:30:08 +01:00
Willy Tarreau	db4e238938	MINOR: task: stop abusing the nice field to detect a tasklet It's cleaner to use a flag from the task's state to detect a tasklet and it's even cheaper. One of the best benefits is that this will allow to get the nice field out of the common part since the tasklet doesn't need it anymore. This commit uses the last task bit available but that's temporary as the purpose of the change is to extend this.	2021-03-05 08:30:08 +01:00
Willy Tarreau	06e69b556c	REORG: tools: promote the debug PRNG to more general use as a statistical one We frequently need to access a simple and fast PRNG for statistical purposes. The debug_prng() function did exactly this using a xorshift generator but its use was limited to debug only. Let's move this to tools.h and tools.c to make it accessible everywhere. Since it needs to be fast, its state is thread-local. An initialization function starts a different initial value for each thread for better distribution.	2021-03-05 08:30:08 +01:00
Ubuntu	6fa9225628	CLEANUP: stream: explain why we queue the stream at the head of the server list In stream_add_srv_conn() MT_LIST_ADD() is used instead of MT_LIST_ADDQ(), resulting in the stream being queued at the head of the server list. This has no particular effect since we cannot dump the streams on a server, and this is only used by "shutdown sessions" on a server. But it also turns out to be significantly faster due to the shorter recovery from the conflict with an adjacent MT_LIST_DEL(), thus it remains desirable to use it, but at least it deserves a comment. In addition to this, it's worth mentioning that this list creates extreme contention with threads while almost never used. It should be made per-thread just like the global streams list.	2021-03-05 08:30:08 +01:00
Willy Tarreau	f587003fe9	MINOR: pools: double the local pool cache size to 1 MB The reason is that H2 can already require 32 16kB buffers for the mux output at once, which will deplete the local cache. Thus it makes sense to go further to leave some time to other connections to release theirs. In addition, the L2 cache on modern CPUs is already 1 MB, so this change is welcome in any case.	2021-03-05 08:30:08 +01:00
Willy Tarreau	0bae075928	MEDIUM: pools: add CONFIG_HAP_NO_GLOBAL_POOLS and CONFIG_HAP_GLOBAL_POOLS We've reached a point where the global pools represent a significant bottleneck with threads. On a 64-core machine, the performance was divided by 8 between 32 and 64 H2 connections only because there were not enough entries in the local caches to avoid picking from the global pools, and the contention on the list there was very high. It becomes obvious that we need to have an array of lists, but that will require more changes. In parallel, standard memory allocators have improved, with tcmalloc and jemalloc finding their ways through mainstream systems, and glibc having upgraded to a thread-aware ptmalloc variant. Keeping this level of contention here isn't justified anymore when we have both the local per-thread pool caches and a fast process-wide allocator. For these reasons, this patch introduces a new compile time setting CONFIG_HAP_NO_GLOBAL_POOLS which is set by default when threads are enabled with thread local pool caches, and we know we have a fast thread-aware memory allocator (currently set for glibc>=2.26). In this case we entirely bypass the global pool and directly use the standard memory allocator when missing objects from the local pools. It is also possible to force it at compile time when a good allocator is used with another setup. It is still possible to re-enable the global pools using CONFIG_HAP_GLOBAL_POOLS, if a corner case is discovered regarding the operating system's default allocator, or when building with a recent libc but a different allocator which provides other benefits but does not scale well with threads.	2021-03-05 08:30:08 +01:00
Ubuntu	f8fb4f75f1	MINOR: atomic: implement a more efficient arm64 __ha_cas_dw() using pairs There finally is a way to support register pairs on aarch64 assembly under gcc, it's just undocumented, like many of the options there :-( As indicated below, it's possible to pass "%H" to mention the high part of a register pair (e.g. "%H0" to go with "%0"): https://patchwork.ozlabs.org/project/gcc/patch/59368A74.2060908@foss.arm.com/ By making local variables from pairs of registers via a struct (as is used in IST for example), we can let gcc choose the correct register pairs and avoid a few moves in certain situations. The code is now slightly more efficient than the previous one on AWS' Graviton2 platform, and noticeably smaller (by 4.5kB approx). A few tests on older releases show that even Linaro's gcc-4.7 used to support such register pairs and %H, and by then ATOMICS were not supported so this should not cause build issues, and as such this patch replaces the earlier implementation.	2021-03-05 08:30:08 +01:00
Willy Tarreau	46cca86900	MINOR: atomic: add armv8.1-a atomics variant for cas-dw This variant uses the CASP instruction available on armv8.1-a CPU cores, which is detected when __ARM_FEATURE_ATOMICS is set (gcc-linaro >= 7, mainline >= 9). This one was tested on cortex-A55 (S905D3) and on AWS' Graviton2 CPUs. The instruction performs way better on high thread counts since it guarantees some forward progress when facing extreme contention while the original LL/SC approach is light on low-thread counts but doesn't guarantee progress. The implementation is not the most optimal possible. In particular since the instruction requires to work on register pairs and there doesn't seem to be a way to force gcc to emit register pairs, we have to decide to force to use the pair (x0,x1) to store the old value, and (x2,x3) to store the new one, and this necessarily involves some extra moves. But at least it does improve the situation with 16 threads and more. See issue #958 for more context. Note, a first implementation of this function was making use of an input/output constraint passed using "+Q"((void*)target), which was resulting in smaller overall code than passing "target" as an input register only. It turned out that the cause was directly related to whether the function was inlined or not, hence the "forceinline" attribute. Any changes to this code should still pay attention to this important factor.	2021-03-05 08:30:08 +01:00
Willy Tarreau	168fc5332c	BUG/MINOR: mt-list: always perform a cpu_relax call on failure On highly threaded machines it is possible to occasionally trigger the watchdog on certain contended areas like the server's connection list, because while the mechanism inherently cannot guarantee a constant progress, it lacks CPU relax calls which are absolutely necessary in this situation to let a thread finish its job. The loop's "while (1)" was changed to use a "for" statement calling __ha_cpu_relax() as its continuation expression. This way the "continue" statements jump to the unique place containing the pause without excessively inflating the code. This was sufficient to definitely fix the problem on 64-core ARM Graviton2 machines. This patch should probably be backported once it's confirmed it also helps on many-cores x86 machines since some people are facing contention in these environments. This patch depends on previous commit "REORG: atomic: reimplement pl_cpu_relax() from atomic-ops.h". An attempt was made to first read the value before exchanging, and it significantly degraded the performance. It's very likely that this caused other cores to lose exclusive ownership on their line and slow down their next xchg operation. In addition it was found that MT_LIST_ADD is significantly faster than MT_LIST_ADDQ under high contention, because it fails one step earlier when conflicting with an adjacent MT_LIST_DEL(). It might be worth switching some operations' order to favor MT_LIST_ADDQ() instead.	2021-03-05 08:30:08 +01:00
Willy Tarreau	958ae26c35	REORG: atomic: reimplement pl_cpu_relax() from atomic-ops.h There is some confusion here as we need to place some cpu_relax statements in some loops where it's not easily possible to condition them on the use of threads. That's what atomic-ops.h already does. So let's take the various pl_cpu_relax() implementations from there and place them in atomic.h under the name __ha_cpu_relax() and let them adapt to the presence or absence of threads and to the architecture (currently only x86 and aarch64 use a barrier instruction, though it's very likely that arm would work well with a cache flushing ISB instruction as well). This time they were implemented as expressions returning 1 rather than statements, in order to ease their placement as the loop condition or the continuation expression inside "for" loops. We should probably do the same with barriers and a few such other ones.	2021-03-05 08:30:08 +01:00
Amaury Denoyelle	8ede3db080	MINOR: backend: handle reuse for conns with no server as target If dispatch mode or transparent backend is used, the backend connection target is a proxy instead of a server. In these cases, the reuse of backend connections is not consistent. With the default behavior, no reuse is done and every new request uses a new connection. However, if http-reuse is set to never, the connections are stored by the mux in the session and can be reused for future requests in the same session. As no server is used for these connections, no reuse can be made outside of the session, similarly to http-reuse never mode. A different http-reuse config value should not have an impact. To achieve this, mark these connections as private to have a defined behavior. For this feature to properly work, the connection hash has been slightly adjusted. The server pointer as an input has been replaced by a generic target pointer to refer to the server or proxy instance. The hash is always calculated on connect_server even if the connection target is not a server. This also requires allocating the connection hash node for every backend connection, not just the ones with a server target.	2021-03-03 11:31:19 +01:00
Frédéric Lécaille	b28812af7a	BUILD: quic: Implicit conversion between SSL related enums. Fix such compilation issues: include/haproxy/quic_tls.h:157:10: error: implicit conversion from 'enum ssl_encryption_level_t' to 'enum quic_tls_enc_level' [-Werror=enum-conversion] 157 \| return ssl_encryption_application; \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ src/xprt_quic.c: In function 'quic_conn_enc_level_init': src/xprt_quic.c:2358:13: error: implicit conversion from 'enum quic_tls_enc_level' to 'enum ssl_encryption_level_t' [-Werror=enum-conversion] 2358 \| qel->level = quic_to_ssl_enc_level(level); \| ^ Not detected by all compilers.	2021-03-02 10:34:18 +01:00
Willy Tarreau	61cfdf4fd8	CLEANUP: tree-wide: replace free(x);x=NULL with ha_free(&x) This makes the code more readable and less prone to copy-paste errors. In addition, it allows to place some __builtin_constant_p() predicates to trigger a link-time error in case the compiler knows that the freed area is constant. It will also produce compile-time error if trying to free something that is not a regular pointer (e.g. a function). The DEBUG_MEM_STATS macro now also defines an instance for ha_free() so that all these calls can be checked. 178 occurrences were converted. The vast majority of them were handled by the following Coccinelle script, some slightly refined to better deal with "&*x" or with long lines: @ rule @ expression E; @@ - free(E); - E = NULL; + ha_free(&E); It was verified that the resulting code is the same, more or less a handful of cases where the compiler optimized slightly differently the temporary variable that holds the copy of the pointer. A non-negligible amount of {free(str);str=NULL;str_len=0;} are still present in the config part (mostly header names in proxies). These ones should also be cleaned for the same reasons, and probably be turned into ist strings.	2021-02-26 21:21:09 +01:00
Christopher Faulet	29e9326f2f	CLEANUP: hlua: Use net_addr structure internally to parse and compare addresses hlua_addr structure may be replaced by net_addr structure to parse and compare addresses. Both structures are similar.	2021-02-26 13:53:26 +01:00
Christopher Faulet	5d1def623a	MEDIUM: http-ana: Add IPv6 support for forwardfor and originalto options A network may be specified to avoid header addition for "forwardfor" and "originalto" option via the "except" parameter. However, only IPv4 networks/addresses are supported. This patch adds the support of IPv6. To do so, the net_addr structure is used to store the parameter value in the proxy structure. And ipcmp2net() function is used to perform the comparison. This patch should fix the issue #1145. It depends on the following commit: * c6ce0ab MINOR: tools: Add function to compare an address to a network address * 5587287 MINOR: tools: Add net_addr structure describing a network address	2021-02-26 13:52:48 +01:00
Christopher Faulet	9553de7fec	MINOR: tools: Add function to compare an address to a network address ipcmp2net() function may be used to compare an address (struct sockaddr_storage) to a network address (struct net_addr). Among other things, this function will be used to add support of IPv6 for "except" parameter of "forwardfor" and "originalto" options.	2021-02-26 13:52:06 +01:00
Christopher Faulet	01f02a4d84	MINOR: tools: Add net_addr structure describing a network address The net_addr structure describes an IPv4 or IPv6 address. Its ip and mask are represented. Among other things, this structure will be used to add support of IPv6 for "except" parameter of "forwardfor" and "originalto" options.	2021-02-26 13:32:17 +01:00
Willy Tarreau	401135cee6	MINOR: task: add one extra tasklet class: TL_HEAVY This class will be used exclusively for heavy processing tasklets. It will be cleaner than mixing them with the bulk ones. For now it's allocated ~1% of the CPU bandwidth. The largest part of the patch consists in re-arranging the fields in the task_per_thread structure to preserve a clean alignment with one more list head. Since we're now forced to increase the struct past a second cache line, it now uses 4 cache lines (for easy multiplying) with the first two ones being exclusively used by local operations and the third one mostly by atomic operations. Interestingly, this better arrangement causes less stress and reduced the response time by 8 microseconds at 1 million requests per second.	2021-02-26 12:00:53 +01:00
Willy Tarreau	d8aa21a611	CLEANUP: server: rename srv_cleanup_{idle,toremove}_connections() These function names are unbearably long, they don't even fit into the screen in "show profiling", let's trim the "_connections" to "_conns", which happens to match the name of the lists there.	2021-02-26 00:30:22 +01:00
Willy Tarreau	74dea8caea	MINOR: task: limit the number of subsequent heavy tasks with flag TASK_HEAVY While the scheduler is priority-aware and class-aware, and consistently tries to maintain fairness between all classes, it doesn't make use of a fine execution budget to compensate for high-latency tasks such as TLS handshakes. This can result in many subsequent calls adding multiple milliseconds of latency between the various steps of other tasklets that don't even depend on this. An ideal solution would be to add a 4th queue, have all tasks announce their estimated cost upfront and let the scheduler maintain an auto- refilling budget to pick from the most suitable queue. But it turns out that a very simplified version of this already provides impressive gains with very tiny changes and could easily be backported. The principle is to reserve a new task flag "TASK_HEAVY" that indicates that a task is expected to take a lot of time without yielding (e.g. an SSL handshake typically takes 700 microseconds of crypto computation). When the scheduler sees this flag when queuing a tasklet, it will place it into the bulk queue. And during dequeuing, we accept only one of these in a full round. This means that the first one will be accepted, will not prevent other lower priority tasks from running, but if a new one arrives, then the queue stops here and goes back to the polling. This will allow to collect more important updates for other tasks that will be batched before the next call of a heavy task. Preliminary tests consisting in placing this flag on the SSL handshake tasklet show that response times under SSL stress fell from 14 ms before the patch to 3.0 ms with the patch, and even 1.8 ms if tune.sched.low-latency is set to "on".	2021-02-26 00:25:51 +01:00
Christopher Faulet	69beaa91d5	REORG: server: Export and rename some functions updating server info Some static functions are now exported and renamed to follow the same pattern of other exported functions. Here is the list : * update_server_fqdn: Renamed to srv_update_fqdn and exported * update_server_check_addr_port: renamed to srv_update_check_addr_port and exported * update_server_agent_addr_port: renamed to srv_update_agent_addr_port and exported * update_server_addr: renamed to srv_update_addr * update_server_addr_port: renamed to srv_update_addr_port * srv_prepare_for_resolution: exported This change is mandatory to move all functions dealing with the server-state files in a separate file.	2021-02-25 10:02:39 +01:00
Christopher Faulet	ecfb9b9109	MEDIUM: server: Store parsed params of a server-state line in the tree Parsed parameters are now stored in the tree of server-state lines. This way, a line from the global server-state file is only parsed once. Before, it was parsed a first time to store it in the tree and one more time to load the server state. To do so, the server-state line object must be allocated before parsing a line. This means its size must no longer depend on the length of first parsed parameters (backend and server names). Thus the node type was changed to use a hashed key instead of a string.	2021-02-25 10:02:39 +01:00
Christopher Faulet	6d87c58fb4	CLEANUP: server: Rename state_line structure into server_state_line The structure used to store a server-state line in an eb-tree has a too generic name. Instead of state_line, the structure is renamed as server_state_line.	2021-02-25 10:02:39 +01:00
Christopher Faulet	fcb53fbb58	CLEANUP: server: Rename state_line node to node instead of name_name <state_line.name_name> field is a node in an eb-tree. Thus, instead of "name_name", we now use "node" to name this field. It is a more explicit name and not too strange.	2021-02-25 10:02:39 +01:00
Willy Tarreau	b2285de049	MINOR: tasks: also compute the tasklet latency when DEBUG_TASK is set It is extremely useful to be able to observe the wakeup latency of some important I/O operations, so let's accept to inflate the tasklet struct by 8 extra bytes when DEBUG_TASK is set. With just this we have enough to get live reports like this: $ socat - /tmp/sock1 <<< "show profiling" Per-task CPU profiling : on # set profiling tasks {on\|auto\|off} Tasks activity: function calls cpu_tot cpu_avg lat_tot lat_avg si_cs_io_cb 8099492 4.833s 596.0ns 8.974m 66.48us h1_io_cb 7460365 11.55s 1.548us 2.477m 19.92us process_stream 7383828 22.79s 3.086us 18.39m 149.5us h1_timeout_task 4157 - - 348.4ms 83.81us srv_cleanup_toremove_connections751 39.70ms 52.86us 10.54ms 14.04us srv_cleanup_idle_connections 21 1.405ms 66.89us 30.82us 1.467us task_run_applet 16 1.058ms 66.13us 446.2us 27.89us accept_queue_process 7 34.53us 4.933us 333.1us 47.58us	2021-02-25 09:44:16 +01:00
Willy Tarreau	45499c56d3	MINOR: task: make grq_total atomic to move it outside of the grq_lock Instead of decrementing grq_total once per task picked from the global run queue, let's do it at once after the loop like we do for other counters. This simplifies the code everywhere. It is not expected to bring noticeable improvements however, since global tasks tend to be less common nowadays.	2021-02-25 09:44:16 +01:00
Willy Tarreau	c03fbeb358	CLEANUP: task: re-merge __task_unlink_rq() with task_unlink_rq() There's no point keeping the two separate anymore, some tests are duplicated for no reason.	2021-02-25 09:44:16 +01:00
Christopher Faulet	e071f0e6a4	MINOR: htx: Add function to reserve the max possible size for an HTX DATA block The function htx_reserve_max_data() should be used to get an HTX DATA block with the max possible size. A current block may be extended or a new one created, depending on the HTX message state. But the idea is to let the caller copy a bunch of data without requesting many new blocks. It is its responsibility to resize the block at the end, to set the final block size. This function will be used to parse messages with small chunks. Indeed, we can have more than 2700 1-byte chunks in 16kB of input data. So it is easy to understand how this function may help to improve the parsing of chunked messages.	2021-02-24 22:10:01 +01:00
Baptiste Assmann	b4badf720c	BUG/MINOR: resolvers: new callback to properly handle SRV record errors When a SRV record was created, it used to register the regular server name resolution callbacks. That said, SRV records and regular server name resolution don't work the same way, especially regarding error management. This patch introduces a new callback to manage DNS errors related to the SRV queries. This fixes GitHub issue #50. Backport status: 2.3, 2.2, 2.1, 2.0	2021-02-24 21:58:45 +01:00
Willy Tarreau	5926e384e6	BUG/MINOR: fd: properly wait for !running_mask in fd_set_running_excl() In fd_set_running_excl() we don't reset the old mask in the CAS loop, so if we fail on the first round, we'll forcefully take the FD on the next one. In practice it's used by fd_insert() and fd_delete() only, neither of which is supposed to be passed an FD which is still in use, given that for now only listeners may be enabled on multiple threads at once. This can be backported to 2.2 but shouldn't result in fixing any user visible bug for now.	2021-02-24 19:40:49 +01:00
Willy Tarreau	9c6dbf0eea	CLEANUP: task: split the large tasklet_wakeup_on() function in two This function has become large with the multi-queue scheduler. We need to keep the fast path and the debugging parts inlined, but the rest now moves to task.c just like was done for task_wakeup(). This has reduced the code size by 6kB due to less inlining of large parts that are always context-dependent, and as a side effect, has increased the overall performance by 1%.	2021-02-24 17:55:58 +01:00
Willy Tarreau	955a11ebfa	MINOR: task: move the allocated tasks counter to the per-thread struct The nb_tasks counter was still global and gets incremented and decremented for each task_new()/task_free(), and was read in process_runnable_tasks(). But it's only used for stats reporting, so doing this so often is pointless and expensive. Let's move it to the task_per_thread struct and have the stats sum it when needed.	2021-02-24 17:42:04 +01:00
Willy Tarreau	018564eaa2	CLEANUP: task: move the tree root detection from __task_wakeup() to task_wakeup() Historically we used to call __task_wakeup() with a known tree root but this is no longer the case and the code has remained needlessly complicated with the root calculation in task_wakeup() passed in argument to __task_wakeup() which compares it again. Let's get rid of this and just move the detection code there. This eliminates some ifdefs and allows to simplify the test conditions quite a bit.	2021-02-24 17:42:04 +01:00
Willy Tarreau	1f3b1417b8	CLEANUP: tasks: use a less confusing name for task_list_size This one is systematically misunderstood due to its unclear name. It is in fact the number of tasks in the local tasklet list. Let's call it "tasks_in_list" to remove some of the confusion.	2021-02-24 17:42:04 +01:00
Willy Tarreau	2c41d77ebc	MINOR: tasks: do not maintain the rqueue_size counter anymore This one is exclusively used as a boolean nowadays and is non-zero only when the thread-local run queue is not empty. Better check the root tree's pointer and avoid updating this counter all the time.	2021-02-24 17:42:04 +01:00
Willy Tarreau	9c7b8085f4	MEDIUM: task: remove the tasks_run_queue counter and have one per thread This counter is solely used for reporting in the stats and is the hottest thread contention point to date. Moving it to the scheduler and having a separate one for the global run queue dramatically improves the performance, showing a 12% boost on the request rate on 16 threads! In addition, the thread debugging output which used to rely on rqueue_size was not totally accurate as it would only report task counts. Now we can return the exact thread's run queue length. It is also interesting to note that there are still a few other task/tasklet counters in the scheduler that are not efficiently updated because some cover a single area and others cover multiple areas. It looks like having a distinct counter for each of the following entries would help and would keep the code a bit cleaner: - global run queue (tree) - per-thread run queue (tree) - per-thread shared tasklets list - per-thread local lists Maybe even splitting the shared tasklets lists between pure tasklets and tasks instead of having the whole and tasks would simplify the code because there remain a number of places where several counters have to be updated.	2021-02-24 17:42:04 +01:00
Willy Tarreau	49de68520e	MEDIUM: streams: do not use the streams lock anymore The lock was still used exclusively to deal with the concurrency between the "show sess" release handler and a stream_new() or stream_free() on another thread. All other accesses made by "show sess" are already done under thread isolation. The release handler only requires to unlink its node when stopping in the middle of a dump (error, timeout etc). Let's just isolate the thread to deal with this case so that it's compatible with the dump conditions, and remove all remaining locking on the streams. This effectively kills the streams lock. The measured gain here is around 1.6% with 4 threads (374krps -> 380k).	2021-02-24 13:54:50 +01:00
Willy Tarreau	a698eb6739	MINOR: streams: use one list per stream instead of a global one The global streams list is exclusively used for "show sess", to look up a stream to shut down, and for the hard-stop. Having all of them in a single list is extremely expensive in terms of locking when using threads, with performance losses as high as 7% having been observed just due to this. This patch makes the list per-thread, since there's no need to have a global one in this situation. All call places just iterate over all threads. The most "invasive" change was in "show sess" where the end of list needs to go back to the beginning of next thread's list until the last thread is seen. For now the lock was maintained to keep the code auditable but a next commit should get rid of it. The observed performance gain here with only 4 threads is already 7% (350krps -> 374krps).	2021-02-24 13:53:20 +01:00
Willy Tarreau	b981318c11	MINOR: stream: add an "epoch" to figure which streams appeared when The "show sess" CLI command currently lists all streams and needs to stop at a given position to avoid dumping forever. Since 2.2 with commit `c6e7a1b8e` ("MINOR: cli: make "show sess" stop at the last known session"), a hack consists in unlinking the stream running the applet and linking it again at the current end of the list, in order to serve as a delimiter. But this forces the stream list to be global, which affects scalability. This patch introduces an epoch, which is a global 32-bit counter that is incremented by the "show sess" command, and which is copied by newly created streams. This way any stream can know whether any other one is newer or older than itself. For now it's only stored and not exploited.	2021-02-24 12:12:51 +01:00
Ilya Shipitsin	98a9e1b873	BUILD: SSL: introduce fine guard for RAND_keep_random_devices_open RAND_keep_random_devices_open is an OpenSSL-specific function, not implemented in LibreSSL and BoringSSL. Let us define the guard HAVE_SSL_RAND_KEEP_RANDOM_DEVICES_OPEN in include/haproxy/openssl-compat.h. That guard no longer depends on HA_OPENSSL_VERSION.	2021-02-22 10:35:23 +01:00
Willy Tarreau	c6ba9a0b9b	MINOR: sched: have one runqueue ticks counter per thread The runqueue_ticks counts the number of task wakeups and is used to position new tasks in the run queue, but since we've had per-thread run queues, the values there are not very relevant anymore and the nice value doesn't apply well if some threads are more loaded than others. In addition, letting all threads compete over a shared counter is not smart as this may cause some excessive contention. Let's move this index close to the run queues themselves, i.e. one per thread and a global one. In addition to improving fairness, this has increased global performance by 2% on 16 threads thanks to the lower contention on rqueue_ticks. Fairness issues were not observed, but if any were to be, this patch could be backported as far as 2.0 to address them.	2021-02-20 13:03:37 +01:00
Willy Tarreau	4d77bbf856	MINOR: dynbuf: pass offer_buffers() the number of buffers instead of a threshold Historically this function would try to wake the most accurate number of process_stream() waiters. But since the introduction of filters which could also require buffers (e.g. for compression), things started not to be as accurate anymore. Nowadays muxes and transport layers also use buffers, so the runqueue size has nothing to do anymore with the number of supposed users to come. In addition to this, the threshold was compared to the number of free buffers calculated as allocated minus used, but this didn't work anymore with local pools since these counts are not updated upon alloc/free! Let's clean this up and pass the number of released buffers instead, and consider that each waiter successfully called counts as one buffer. This is not rocket science and will not suddenly fix everything, but at least it cannot be as wrong as it is today. This could have been marked as a bug given that the current situation is totally broken regarding this, but this probably doesn't completely fix it, it only goes in a better direction. It is possible however that it makes sense in the future to backport this as part of a larger series if the situation significantly improves.	2021-02-20 12:38:18 +01:00
Willy Tarreau	90f366b595	MINOR: dynbuf: use regular lists instead of mt_lists for buffer_wait There's no point anymore in keeping mt_lists for the buffer_wait and buffer_wq since it's thread-local now.	2021-02-20 12:38:18 +01:00
Willy Tarreau	e8e5091510	MINOR: dynbuf: make the buffer wait queue per thread The buffer wait queue used to be global historically but this does not make any sense anymore given that the most common use case is to have thread-local pools. Thus there's no point waking up waiters of other threads after releasing an entry, as they won't benefit from it. Let's move the queue head to the thread_info structure and use ti->buffer_wq from now on.	2021-02-20 12:38:18 +01:00
Christopher Faulet	ea2cdf55e3	MEDIUM: server: Don't introduce a new server-state file version This reverts the commit `63e6cba12` ("MEDIUM: server: add server-states version 2"), but keeps all recent features added to the server-state file. Instead of adding a 2nd version for the server-state file format to handle the 5 new fields added during the 2.4 development, these fields are considered as optional during the parsing. So it is possible to load a server-state file from HAProxy 2.3. However, from 2.4, these new fields are always dumped in the server-state file. But it should not be a problem to load it on the 2.3. This patch seems a bit huge but the diff ignoring the space is much smaller. The version 2 of the server-state file format is reserved for a real refactoring to address all issues of the current format.	2021-02-19 18:03:59 +01:00
Amaury Denoyelle	8990b010a0	MINOR: connection: allocate dynamically hash node for backend conns Remove ebmb_node entry from struct connection and create a dedicated struct conn_hash_node. struct connection now contains only a pointer to a conn_hash_node, allocated only for connections where target is of type OBJ_TYPE_SERVER. This will reduce the memory footprint of every connection that does not need http-reuse, such as frontend connections.	2021-02-19 16:59:18 +01:00
Olivier Houchard	5567f41d0a	BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ(). In MT_LIST_TRY_ADDQ(), deal with the "prev" field of the element before the "next". If the element is the first in the list, then its next will already have been locked when we locked list->prev->next, so locking it again will fail, and we'll start over and over. This should be backported to 2.3.	2021-02-19 16:47:20 +01:00
Willy Tarreau	66161326fd	MINOR: listener: refine the default MAX_ACCEPT from 64 to 4 The maximum number of connections accepted at once by a thread for a single listener used to default to 64 divided by the number of processes but the tasklet-based model is much more scalable and benefits from smaller values. Experimentation has shown that 4 gives the highest accept rate for all thread values, and that 3 and 5 come very close, as shown below (HTTP/1 connections forwarded per second at multi-accept 4 and 64): ac\thr\| 1 2 4 8 16 ------+------------------------------ 4\| 80k 106k 168k 270k 336k 64\| 63k 89k 145k 230k 274k Some tests were also conducted on SSL and absolutely no change was observed. The value was placed into a define because it used to be spread all over the code. It might be useful at some point to backport this to 2.3 and 2.2 to help those who observed some performance regressions from 1.6.	2021-02-19 16:02:04 +01:00
Willy Tarreau	4327d0ac00	MINOR: tasks: refine the default run queue depth Since a lot of internal callbacks were turned to tasklets, the runqueue depth had not been readjusted from the default 200 which was initially used to favor batched processing. But nowadays it appears too large already based on the following tests conducted on a 8c16t machine with a simple config involving "balance leastconn" and one server. The setup always involved the two threads of a same CPU core except for 1 thread, and the client was running over 1000 concurrent H1 connections. The number of requests per second is reported for each (runqueue-depth, nbthread) couple: rq\thr\| 1 2 4 8 16 ------+------------------------------ 32\| 120k 159k 276k 477k 698k 40\| 122k 160k 276k 478k 722k 48\| 121k 159k 274k 482k 720k 64\| 121k 160k 274k 469k 710k 200\| 114k 150k 247k 415k 613k <-- default It's possible to save up to about 18% performance by lowering the default value to 40. One possible explanation to this is that checking I/Os more frequently allows to flush buffers faster and to smooth the I/O wait time over multiple operations instead of alternating phases of processing, waiting for locks and waiting for new I/Os. The total round trip time also fell from 1.62ms to 1.40ms on average, among which at least 0.5ms is attributed to the testing tools since this is the minimum attainable on the loopback. After some observation it would be nice to backport this to 2.3 and 2.2 which observe similar improvements, since some users have already observed some perf regressions between 1.6 and 2.2.	2021-02-19 16:01:55 +01:00
Ilya Shipitsin	c47d676bd7	BUILD: ssl: introduce fine guard for OpenSSL specific SCTL functions SCTL (signed certificate timestamp list) specified in RFC6962 was implemented in c74ce24cd22e8c683ba0e5353c0762f8616e597d, let us introduce macro HAVE_SSL_SCTL for the HAVE_SSL_SCTL sake, which in turn is based on SN_ct_cert_scts, which comes in the same commit	2021-02-18 15:55:50 +01:00
Christopher Faulet	8dd40fbde9	BUG/MINOR: sample: Always consider zero size string samples as unsafe smp_is_safe() function is used to be sure a sample may be safely modified. For string samples, a test is performed to verify if there is a null-terminated byte. If not, one is added, if possible. It means if the sample is not const and if there is some free space in the buffer, after data. However, we must not try to read the null-terminated byte if the string sample is too long (data >= size) or if the size is equal to zero. This last test was not performed. Thus it was possible to consider a string sample as safe by testing a byte outside the buffer. Now, a zero size string sample is always considered as unsafe and is duplicated when smp_make_safe() is called. This patch must be backported in all stable versions.	2021-02-18 14:58:43 +01:00
Willy Tarreau	ca9f60c1ac	MINOR: tasks/debug: add some extra controls of use-after-free in DEBUG_TASK It's pretty easy to pre-initialize the index, change it on free() and check it during the wakeup, so let's do this to ease detection of any accidental task_wakeup() after a task_free() or tasklet_wakeup() after a tasklet_free(). If this would ever happen we'd then get a backtrace and a core now. The index's parity is respected so that the call history remains exploitable.	2021-02-18 14:38:49 +01:00
Willy Tarreau	b23f04260b	MINOR: tasks: add DEBUG_TASK to report caller info in a task The idea is to know who woke a task up, by recording the last two callers in a rotating mode. For now it's trivial with task_wakeup() but tasklet_wakeup_on() will require quite some more changes. This typically gives this from the debugger: (gdb) p t->debug $2 = { caller_file = {0x0, 0x8c0d80 "src/task.c"}, caller_line = {0, 260}, caller_idx = 1 } or this: (gdb) p t->debug $6 = { caller_file = {0x7fffe40329e0 "", 0x885feb "src/stream.c"}, caller_line = {284, 284}, caller_idx = 1 } But it also provides a trivial macro allowing to simply place a call in a task/tasklet handler that needs to be observed: DEBUG_TASK_PRINT_CALLER(t); Then starting haproxy this way would trivially yield such info: $ ./haproxy -db -f test.cfg \| sort \| uniq -c \| sort -nr 199992 h1_io_cb woken up from src/sock.c:797 51764 h1_io_cb woken up from src/mux_h1.c:3634 65 h1_io_cb woken up from src/connection.c:169 45 h1_io_cb woken up from src/sock.c:777	2021-02-18 10:42:07 +01:00
Willy Tarreau	59b0fecfd9	MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock The two algos defining these functions (first and leastconn) do not need the server's lock. However it's already present in pendconn_process_next_strm() so the API must be updated so that the functions may take it if needed and that the callers indicate whether they already own it. As such, the call places (backend.c and stream.c) now do not take it anymore, queue.c was unchanged since it's already held, and both "first" and "leastconn" were updated to take it if not already held. A quick test on the "first" algo showed a jump from 432 to 565k rps by just dropping the lock in stream.c!	2021-02-18 10:06:45 +01:00
Willy Tarreau	b9ad30a8ad	Revert "MINOR: threads: change lock_t to an unsigned int" This reverts commit `8f1f177ed0`. Repeated tests have shown a small performance degradation of ~1.8% caused by this patch at high request rates on 16 threads. The exact cause is not yet perfectly known but it probably stems from slower accesses for non-64-bit aligned atomic accesses.	2021-02-18 10:06:45 +01:00
Willy Tarreau	751153e0f1	OPTIM: server: switch the actconn list to an mt-list The remaining contention on the server lock solely comes from sess_change_server() which takes the lock to add and remove a stream from the server's actconn list. This is both expensive and pointless since we have mt-lists, and this list is only used by the CLI's "shutdown server sessions" command! Let's migrate to an mt-list and remove the need for this costly lock. By doing so, the request rate increased by ~1.8%.	2021-02-18 10:06:45 +01:00
Willy Tarreau	ccea3c54f4	DEBUG: thread: add 5 extra lock labels for statistics and debugging Since OTHER_LOCK is commonly used it's become much more difficult to profile lock contention by temporarily changing a lock label. Let's add DEBUG1..5 to serve only for debugging. These ones must not be used in committed code. We could decide to only define them when DEBUG_THREAD is set but that would complicate attempts at measuring performance with debugging turned off.	2021-02-18 10:06:45 +01:00
Willy Tarreau	4e9df2737d	BUG/MEDIUM: checks: don't needlessly take the server lock in health_adjust() The server lock was taken preventively for anything in health_adjust(), including the static config checks needed to detect that the lock was not needed, while the function is always called on the response path to update a server's status. This was responsible for huge contention causing a performance drop of about 17% on 16 threads. Let's move the lock only where it should be, i.e. inside the function around the critical sections only. By doing this, a 16-thread process jumped back from 575 to 675 krps. This should be backported to 2.3 as the situation degraded there, and maybe later to 2.2.	2021-02-18 10:06:45 +01:00
Amaury Denoyelle	36441f46c4	MINOR: connection: remove pointers for prehash in conn_hash_params Replace unneeded pointers for sni/proxy prehash by plain data type. The code is slightly cleaner.	2021-02-17 16:43:07 +01:00
Amaury Denoyelle	aba507334b	BUG/MAJOR: connection: prevent double free if conn selected for removal Always try to remove a connection from its toremove_list in conn_free. This prevents a double-free in case the connection is freed but was already added in toremove_list. This bug was easily reproduced by running 4-5 runs of inject on a single-thread instance of haproxy : $ inject -u 10000 -d 10 -G 127.0.0.1:20080 A crash would soon be triggered in srv_cleanup_toremove_connections. This does not need to be backported.	2021-02-16 16:17:29 +01:00
William Dauchy	3679d0c794	MINOR: stats: add helper to get status string move listen status to a helper, defining both status enum and string definition. this will be helpful to be reused in prometheus code. It also removes this hard-to-read nested ternary. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-15 14:13:32 +01:00
William Dauchy	655e14ef17	MEDIUM: stats: allow to select one field in `stats_fill_li_stats` prometheus approach requires to output all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. From this patch it should be possible to add support for listen stats in prometheus. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-15 14:13:32 +01:00
Emeric Brun	fd647d5f5f	MEDIUM: dns: adds code to support pipelined DNS requests over TCP. This patch introduces the "dns_stream_nameserver" to use DNS over TCP on strict nameservers. For the upper layer it is analogous to the API used with UDP nameservers, except that the user switches the name server to "stream" mode at init using "dns_stream_init". The fallback from UDP to TCP is not handled and this is not the purpose of this feature. This is done to choose the transport layer during the initialization. Currently there is a hardcoded limit of 4 pipelined transactions per TCP connection. A batch of idle connections is expired every 5s. This code is designed to support a maximum DNS message size on TCP: 64k. Note: this code won't retry unanswered queries; this should be handled by the upper layer.	2021-02-13 10:03:46 +01:00
Emeric Brun	c943799c86	MEDIUM: resolvers/dns: split dns.c into dns.c and resolvers.c This patch splits the current dns.c into two files: The first, dns.c, contains code related to DNS message exchange over UDP and in the future over TCP. We try to remove dependencies on resolving to make it usable by other stuff such as DNS load balancing. The new resolvers.c inherits the code specific to the actual resolvers. Note: It was really difficult to obtain a clean diff due to the amount of moved code. Note2: Counters and stuff related to stats are not cleanly separated because currently counters for both layers are merged and hard to separate for now.	2021-02-13 10:03:46 +01:00
Emeric Brun	d26a6237ad	MEDIUM: resolvers: split resolving and dns message exchange layers. This patch splits recv and send functions in two layers. The lowest is responsible for DNS message transactions over the network. Doing this we could use the DNS message layer for something else than resolving. Load balancing for instance. This patch also re-works the way to init a nameserver and introduces the new struct dns_dgram_server to prepare the arrival of dns_stream_server and the support of DNS over TCP. The way to retry a send failure of a request because of EAGAIN was re-worked. Previously there was no control and all "pending" queries were re-played each time it reached an EAGAIN. This patch introduces a ring to stack messages in case of send failure. This ring is emptied if the poller shows that the socket is ready again to push messages.	2021-02-13 09:51:10 +01:00
Emeric Brun	d3b4495f0d	MINOR: resolvers: rework dns stats prototype because specific to resolvers Counters are currently stored in the low-level nameservers struct but most of them are resolving layer data and increased in the upper layer. So this patch renames the prototype used to allocate/dump them with prefix 'resolv' while waiting for a clean split.	2021-02-13 09:43:18 +01:00
Emeric Brun	6a2006ae37	MINOR: resolvers: replace nameserver's resolver ref by generic parent pointer This will allow to use nameservers in something else than a resolver section (load balancing for instance).	2021-02-13 09:43:18 +01:00
Emeric Brun	d30e9a1709	MINOR: resolvers: rework prototype suffixes to split resolving and dns. A lot of prototypes in dns.h are specific to resolvers and must be renamed to split resolving and DNS layers.	2021-02-13 09:43:18 +01:00
Emeric Brun	456de77bdb	MINOR: resolvers: renames resolvers DNS_UPD_* returncodes to RSLV_UPD_* This patch renames some #defines prefixes from DNS to RSLV.	2021-02-13 09:43:18 +01:00
Emeric Brun	30c766ebbc	MINOR: resolvers: renames resolvers DNS_RESP_* errcodes RSLV_RESP_* This patch renames some #defines prefixes from DNS to RSLV.	2021-02-13 09:43:18 +01:00
Emeric Brun	21fbeedf97	MINOR: resolvers: renames some dns prefixed types using resolv prefix. @@ -119,8 +119,8 @@ struct act_rule { - } dns; /* dns resolution */ + } resolv; /* resolving */ -struct dns_options { +struct resolv_options {	2021-02-13 09:43:18 +01:00
Emeric Brun	08622d3c0a	MINOR: resolvers: renames some resolvers specific types to not use dns prefix This patch applies those changes on names: -struct dns_resolution { +struct resolv_resolution { -struct dns_requester { +struct resolv_requester { -struct dns_srvrq { +struct resolv_srvrq { @@ -185,12 +185,12 @@ struct stream { struct { - struct dns_requester dns_requester; + struct resolv_requester requester; ... - } dns_ctx; + } resolv_ctx;	2021-02-13 09:43:18 +01:00
Emeric Brun	750fe79cd0	MINOR: resolvers: renames type dns_resolvers to resolvers. It also renames 'dns_resolvers' head list to sec_resolvers to avoid conflicts with local variables 'resolvers'.	2021-02-13 09:43:17 +01:00
Emeric Brun	85914e9d9b	MINOR: resolvers: renames some resolvers internal types and removes dns prefix Some types are specific to resolver code and are renamed using the 'resolv' prefix instead of 'dns'. -struct dns_query_item { +struct resolv_query_item { -struct dns_answer_item { +struct resolv_answer_item { -struct dns_response_packet { +struct resolv_response {	2021-02-13 09:43:17 +01:00
Emeric Brun	67f830d29d	BUG/MINOR: resolvers: fix attribute packed struct for dns This patch adds the attribute packed on struct dns_question because it is directly memcpy to network building a response. This patch also removes the commented line: // struct list options; /* list of option records */ because it is also used directly using memcpy to build a request and must not contain host data.	2021-02-13 09:43:17 +01:00
Emeric Brun	50c870e4de	BUG/MINOR: dns: add missing sent counter and parent id to dns counters. Resolv callbacks are also updated to rely on counters and not on nameservers. "show stat domain dns" will now show the parent id (i.e. resolvers section name).	2021-02-13 09:43:17 +01:00
Emeric Brun	e14b98c08e	MINOR: ring: adds new ring_init function. Adds the new ring_init function to initialize a pre-allocated ring struct using the given memory area.	2021-02-13 09:43:17 +01:00
Willy Tarreau	49962b58d0	MINOR: peers/cli: do not dump the peers dictionaries by default on "show peers" The "show peers" output has become huge due to the dictionaries making it less readable. Now this feature has reached a certain level of maturity which doesn't warrant to dump it all the time, given that it was essentially needed by developers. Let's make it optional, and disabled by default, only when "show peers dict" is requested. The default output reminds about the command. The output has been divided by 5 : $ socat - /tmp/sock1 <<< "show peers dict" \| wc -l 125 $ socat - /tmp/sock1 <<< "show peers" \| wc -l 26 It could be useful to backport this to recent stable versions.	2021-02-12 17:00:52 +01:00
Willy Tarreau	e90904d5a9	MEDIUM: proxy: store the default proxies in a tree by name Now default proxies are stored into a dedicated tree, sorted by name. Only unnamed entries are not kept upon new section creation. The very first call to cfg_parse_listen() will automatically allocate a dummy defaults section which corresponds to the previous static one, since the code requires to have one at a few places. The first immediately visible benefit is that it allows to reuse alloc_new_proxy() to allocate a defaults section instead of doing it by hand. And the secret goal is to allow to keep multiple named defaults section in memory to reuse them from various proxies.	2021-02-12 16:23:46 +01:00
Willy Tarreau	0a0f6a7e4f	MINOR: proxy: support storing defaults sections into their own tree Now we'll have a tree of named defaults sections. The regular insertion and lookup functions take care of the capability in order to select the appropriate tree. A new function proxy_destroy_defaults() removes a proxy from this tree and frees it entirely.	2021-02-12 16:23:46 +01:00
Willy Tarreau	80dc6fea59	MINOR: proxy: add a new capability PR_CAP_DEF In order to more easily distinguish a default proxy from a standard one, let's introduce a new capability PR_CAP_DEF.	2021-02-12 16:23:46 +01:00
Willy Tarreau	7d0c143185	MINOR: cfgparse: move defproxy to cfgparse-listen as a static We don't want to expose this one anymore as we'll soon keep multiple default proxies. Let's move it inside the parser which is the only place which still uses it, and initialize it on the fly once needed instead of doing it at boot time.	2021-02-12 16:23:46 +01:00
Willy Tarreau	bb8669ae28	BUG/MINOR: server: parse_server() must take a const for the defproxy The default proxy was passed as a variable, which in addition to being a PITA to deal with in the config parser, doesn't feel safe to use when it ought to be const. This will only affect new code so no backport is needed.	2021-02-12 16:23:46 +01:00
Willy Tarreau	54fa7e332a	BUG/MINOR: tcpcheck: proxy_parse_check() must take a const for the defproxy The default proxy was passed as a variable, which in addition to being a PITA to deal with in the config parser, doesn't feel safe to use when it ought to be const. This will only affect new code so no backport is needed.	2021-02-12 16:23:46 +01:00
Willy Tarreau	220fd70694	BUG/MINOR: extcheck: proxy_parse_extcheck() must take a const for the defproxy The default proxy was passed as a variable, which in addition to being a PITA to deal with in the config parser, doesn't feel safe to use when it ought to be const. This will only affect new code so no backport is needed.	2021-02-12 16:23:46 +01:00
Willy Tarreau	a3320a0509	MINOR: proxy: move the defproxy freeing code to proxy.c This used to be open-coded in cfgparse-listen.c when facing a "defaults" keyword. Let's move this into proxy_free_defaults(). This code is ugly and doesn't even reset the just freed pointers. Let's not change this yet. This code should probably be merged with a generic proxy deinit function called from deinit(). However there's a catch on uri_auth which cannot be freed because it might be used by one or several proxies. We definitely need refcounts there!	2021-02-12 16:23:46 +01:00
Willy Tarreau	7683893c70	REORG: proxy: centralize the proxy allocation code into alloc_new_proxy() This new function takes over the old open-coding that used to be done for too long in cfg_parse_listen() and it now does everything at once in a proxy-centric function. The function does all the job of allocating the structure, initializing it, presetting its defaults from the default proxy and checking for errors. The code was almost unchanged except for defproxy being passed as a pointer, and the error message being passed using memprintf(). This change will be needed to ease reuse of multiple default proxies, or to create dynamic backends in a distant future.	2021-02-12 16:23:46 +01:00
Willy Tarreau	144289b459	REORG: move init_default_instance() to proxy.c and pass it the defproxy pointer init_default_instance() was still left in cfgparse.c which is not the best place to pre-initialize a proxy. Let's place it in proxy.c just after init_new_proxy(), take this opportunity for renaming it to proxy_preset_defaults() and taking out init_new_proxy() from it, and let's pass it the pointer to the default proxy to be initialized instead of implicitly assuming defproxy. We'll soon be able to exploit this. Only two call places had to be updated.	2021-02-12 16:23:46 +01:00
Willy Tarreau	168a414037	BUILD: proxy: add missing compression-t.h to proxy-t.h struct comp is used in struct proxy but never declared prior to this so depending on where proxy.h is included, touching the <comp> field can break the build.	2021-02-12 16:23:46 +01:00
Willy Tarreau	09f2e77eb1	BUG/MINOR: tcpheck: the source list must be a const in dup_tcpcheck_var() This is just an API bug but it's annoying when trying to tidy the code. The source list passed in argument must be a const and not a variable, as it's typically the list head from a default proxy and must obviously not be modified by the function. No backport is needed as it only impacts new code.	2021-02-12 16:23:46 +01:00
Willy Tarreau	016255a483	BUG/MINOR: http-htx: defpx must be a const in proxy_dup_default_conf_errors() This is just an API bug but it's annoying when trying to tidy the code. The default proxy passed in argument must be a const and not a variable. No backport is needed as it only impacts new code.	2021-02-12 16:23:46 +01:00
Willy Tarreau	5bbc676608	BUG/MINOR: stats: revert the change on ST_CONVDONE In 2.1, commit `ee4f5f83d` ("MINOR: stats: get rid of the ST_CONVDONE flag") introduced a subtle bug. By testing curproxy against defproxy in check_config_validity(), it tried to eliminate the need for a flag to indicate that stats authentication rules were already compiled, but by doing so it left the issue opened for the case where a new defaults section appears after the two proxies sharing the first one: defaults mode http stats auth foo:bar listen l1 bind :8080 listen l2 bind :8181 defaults # just to break above This config results in: [ALERT] 042/113725 (3121) : proxy 'f2': stats 'auth'/'realm' and 'http-request' can't be used at the same time. [ALERT] 042/113725 (3121) : Fatal errors found in configuration. Removing the last defaults remains OK. It turns out that the cleanups that followed that patch render it useless, so the best fix is to revert the change (with the up-to-date flags instead). The flag was marked as belonging to the config. It's not exact but it's the closest to the reality, as it's not there to configure the behavior but to mention that the config parser did its job. This could be backported as far as 2.1, but in practice it looks like nobody ever hit it.	2021-02-12 16:23:45 +01:00
William Dauchy	d1a7b85a40	MEDIUM: server: support {check,agent}_addr, agent_port in server state logical followup from cli commands addition, so that the state server file stays compatible with the changes made at runtime; use previously added helper to load server attributes. also alloc a specific chunk to avoid mixing with other called functions using it Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	63e6cba12a	MEDIUM: server: add server-states version 2 Even if it is possibly too much work for the current usage, it makes sure we don't break states file from v2.3 to v2.4; indeed, since v2.3, we introduced two new fields, so we put them aside to guarantee we can easily reload from a version 1. The diff seems huge but there is no specific change apart from: - introduce v2 where it is needed (parsing, update) - move away from switch/case in update to be able to reuse code - move srv lock to the whole function to make it easier. This patch confirms how painful it is to maintain this functionality. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
Amaury Denoyelle	1921d20fff	MINOR: connection: use proxy protocol as parameter for srv conn hash Use the proxy protocol frame if proxy protocol is activated on the server line. Do not add these connections to the private list anymore. If some requests are made with the same proxy fields, they can reuse the idle connection. The reg-tests proxy_protocol_send_unique_id must be adapted as it relied on the side effect behavior that every request from a same connection reused a private server connection. Now, a new connection is created as expected if the proxy protocol fields differ.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	d10a200f62	MINOR: connection: use src addr as parameter for srv conn hash The source address is used as an input to the server connection hash. The address and port are used as separate hash inputs. Do not add these connections to the private list anymore. This parameter is set only if used in the transparent-proxy mode.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	01a287f1e5	MINOR: connection: use dst addr as parameter for srv conn hash The destination address is used as an input to the server connection hash. The address and port are used as separated hash inputs. Note that they are not used when statically specified on the server line. This is only useful for dynamic destination address. This is typically used when the server address is dynamically set via the set-dst action. The address and port are separated hash parameters. Most notably, it should fixed set-dst use case (cf github issue #947).	2021-02-12 12:53:56 +01:00
Amaury Denoyelle	9b626e3c19	MINOR: connection: use sni as parameter for srv conn hash The sni parameter is an input to the server connection hash. Do not add anymore connections with dynamic sni in the private list. Thus, it is now possible to reuse a server connection if they use the same sni.	2021-02-12 12:48:11 +01:00
Amaury Denoyelle	293dcc400e	MINOR: backend: compare conn hash for session conn reuse Compare the connection hash when reusing a connection from the session. This ensures that a private connection is reused only if it shares the same set of parameters.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	1a58aca84e	MINOR: connection: use the srv pointer for the srv conn hash The pointer of the target server is used as the first parameter for the server connection hash calculation. This prevents the hash from being null when no specific parameters are present, and can serve as a simple defense against an attacker trying to reuse a non-conforming connection.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	81c6f76d3e	MINOR: connection: prepare hash calculation for server conns This is preliminary work for the calculation of the backend connection hash. A conn_hash_params structure is the input for the operation, containing the various specific parameters of a connection. The high bits of the hash will reflect the parameters present as input. A set of macros is written to manipulate the connection hash and extract the parameters/payload.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	f232cb3e9b	MEDIUM: connection: replace idle conn lists by eb trees The server idle/safe/available connection lists are replaced with ebmb-trees. This is used to store backend connections, with the new connection hash field as the key. The hash is an 8-byte field, used to reflect specific connection parameters. This is preliminary work to be able to reuse connections with SNI, explicit src/dst address or PROXY protocol.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	5c7086f6b0	MEDIUM: connection: protect idle conn lists with locks This is preparation work for connection reuse with sni/proxy protocol/specific src-dst addresses. Protect every access to idle conn lists with a lock. This is currently not strictly needed because accesses to the lists are made with atomic operations. However, to be able to reuse connections with specific parameters, the list storage will be converted to eb-trees. As this structure does not have atomic operations, it is mandatory to protect it with a lock. For this, the takeover lock is reused. Its role was to protect connections during takeover. As it is now extended to general idle conns usage, it is renamed to idle_conns_lock. A new lock section named IDLE_CONNS_LOCK is also instantiated to isolate its impact on performance.	2021-02-12 12:33:04 +01:00
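The hash layout described in this commit (high bits flag which parameters are present, the rest carries the hashed payload) can be sketched roughly as follows. This is only an illustration: the 4/60 bit split, the flag values, the `conn_hash()` and `hash_flags()` names and the use of BLAKE2b as a stand-in hash function are all assumptions, not taken from the HAProxy sources.

```python
import hashlib
import struct

# Hypothetical layout: the top 4 bits flag which optional parameters were
# supplied, the low 60 bits carry the hashed payload. The real bit widths
# and hash function are not given in the commit message.
FLAG_BITS = 4
PAYLOAD_BITS = 64 - FLAG_BITS
FLAG_SNI, FLAG_DST, FLAG_SRC, FLAG_PROXY = 0x1, 0x2, 0x4, 0x8


def conn_hash(srv_id, sni=None, dst=None, src=None, proxy=None):
    """Build a 64-bit key from a server identifier and optional parameters."""
    flags = 0
    h = hashlib.blake2b(digest_size=8)  # stand-in hash, an assumption
    # The server identifier is always hashed first, so the payload is
    # never empty even when no optional parameter is present.
    h.update(struct.pack("<Q", srv_id))
    optional = ((FLAG_SNI, sni), (FLAG_DST, dst), (FLAG_SRC, src), (FLAG_PROXY, proxy))
    for flag, value in optional:
        if value is not None:
            flags |= flag
            data = value.encode()
            # Prefix each parameter with its flag and length to keep
            # different parameter combinations distinct.
            h.update(struct.pack("<BI", flag, len(data)))
            h.update(data)
    payload = int.from_bytes(h.digest(), "little") & ((1 << PAYLOAD_BITS) - 1)
    return (flags << PAYLOAD_BITS) | payload


def hash_flags(value):
    """Extract the 'parameters present' flags from the high bits."""
    return value >> PAYLOAD_BITS
```

Two connections to the same server with the same SNI get the same key and can share an idle connection, while a lookup with a different set of parameters cannot match because the flag bits differ.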
William Dauchy	38cd986c54	BUG/MINOR: server: re-align state file fields number Since commit `3169471964` ("MINOR: Add server port field to server state file.") max_fields was not increased on version number 1. So this patch aims to fix it. This should be backported as far as v1.8, but the numbering should be adapted depending on the version: simply increase the field by 1. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-10 16:25:42 +01:00
William Lallemand	7b41654495	MINOR: ssl: add SSL_SERVER_LOCK label in threads.h Amaury reported that the commit `3ce6eed` ("MEDIUM: ssl: add a rwlock for SSL server session cache") introduced a warning during compilation: include/haproxy/thread.h|411 col 2| warning: enumeration value 'SSL_SERVER_LOCK' not handled in switch [-Wswitch] This patch fixes the issue by adding the right entry in the switch block. Must be backported where `3ce6eed` is backported. (2.4 only for now)	2021-02-10 16:17:19 +01:00
Willy Tarreau	826f3ab5e6	MINOR: stick-tables/counters: add http_fail_cnt and http_fail_rate data types Historically we've been counting lots of client-triggered events in stick tables to help detect misbehaving ones, but we've been missing the same on the server side, and there's been repeated requests for being able to count the server errors per URL in order to precisely monitor the quality of service or even to avoid routing requests to certain dead services, which is also called "circuit breaking" nowadays. This commit introduces http_fail_cnt and http_fail_rate, which work like http_err_cnt and http_err_rate in that they respectively count events and their frequency, but they only consider server-side issues such as network errors, unparsable and truncated responses, and 5xx status codes other than 501 and 505 (since these ones are usually triggered by the client). Note that retryable errors are purposely not accounted for, so that only what the client really sees is considered. With this it becomes very simple to put some protective measures in place to perform a redirect or return an excuse page when the error rate goes beyond a certain threshold for a given URL, and give more chances to the server to recover from this condition. Typically it could look like this to bypass a URL causing more than 10 requests per second: stick-table type string len 80 size 4k expire 1m store http_fail_rate(1m) http-request track-sc0 base # track host+path, ignore query string http-request return status 503 content-type text/html \ lf-file excuse.html if { sc0_http_fail_rate gt 10 } A more advanced mechanism using gpt0 could even implement high/low rates to disable/enable the service. Reg-test converteers_ref_cnt_never_dec.vtc was updated to test it.	2021-02-10 12:27:01 +01:00
Willy Tarreau	e66ee1a651	BUG/MINOR: intops: fix mul32hi()'s off-by-one mul32hi() multiplies a constant a with a variable b from 0 to 0xffffffff and shifts the result by 32 bits. It's visible that it's always impossible to reach the constant a this way because the product always misses exactly one unit of a to be preserved. And this cannot be corrected by the caller either as adding one to the output will only shift the output range, and it's not possible to pass 2^32 on the ratio <b>. The right approach is to add "a" after the multiplication so that the input range is always preserved for all ratio values from 0 to 0xffffffff: (a=0x00000000 * b=0x00000000 + a=0x00000000) >> 32 = 0x00000000 (a=0x00000000 * b=0x00000001 + a=0x00000000) >> 32 = 0x00000000 (a=0x00000000 * b=0xffffffff + a=0x00000000) >> 32 = 0x00000000 (a=0x00000001 * b=0x00000000 + a=0x00000001) >> 32 = 0x00000000 (a=0x00000001 * b=0x00000001 + a=0x00000001) >> 32 = 0x00000000 (a=0x00000001 * b=0xffffffff + a=0x00000001) >> 32 = 0x00000001 (a=0xffffffff * b=0x00000000 + a=0xffffffff) >> 32 = 0x00000000 (a=0xffffffff * b=0x00000001 + a=0xffffffff) >> 32 = 0x00000001 (a=0xffffffff * b=0xffffffff + a=0xffffffff) >> 32 = 0xffffffff This is only used in freq_ctr calculations and the slightly lower value is unlikely to have ever been noticed by anyone. This may be backported though it is not important.	2021-02-09 17:52:50 +01:00
William Lallemand	3ce6eedb37	MEDIUM: ssl: add a rwlock for SSL server session cache When adding the server side support for certificate update over the CLI we encountered a design problem with the SSL session cache which was not locked. Indeed, once a certificate is updated we need to flush the cache, but we also need to ensure that the cache is not used during the update. To prevent the use of the cache during an update, this patch introduces a rwlock for the SSL server session cache. In the SSL session part this patch only takes the read lock, even when it writes. The reason behind this is that in the session part, there is one cache storage per thread so it is not a problem to write in the cache from several threads. The problem only arises when trying to write in the cache from the CLI (which could be on any thread) while a session is trying to access the cache. So there is a write lock in the CLI part to prevent simultaneous access by a session and the CLI. This patch also removes the thread_isolate attempt, which was eating too much CPU time and was not protecting from the use of a freed pointer in the session.	2021-02-09 09:43:44 +01:00
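The arithmetic in the table above can be replayed with a small model. This is a Python transcription of the formula quoted in the commit message, not the C source; `mul32hi_old()` is a hypothetical name for the pre-fix behaviour, and the explicit masks stand in for 32-bit unsigned operands.

```python
MASK32 = 0xFFFFFFFF


def mul32hi_old(a, b):
    """Pre-fix behaviour: (a * b) >> 32, which can never return a itself."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def mul32hi(a, b):
    """Fixed behaviour: add a after the multiplication, then shift by 32."""
    a &= MASK32
    b &= MASK32
    return (a * b + a) >> 32
```

With the full ratio `b = 0xffffffff`, the old formula yields `a - 1` for any non-zero `a` while the fixed one yields exactly `a`, which is the off-by-one the commit describes.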
Ilya Shipitsin	acf84595a7	CLEANUP: assorted typo fixes in the code and comments This is 17th iteration of typo fixes	2021-02-08 10:49:08 +01:00
Ilya Shipitsin	7bbf5866e0	BUILD: ssl: fix typo in HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT macro HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was introduced in `ec60909871`, however it was defined as HAVE_SL_CTX_ADD_SERVER_CUSTOM_EXT (missing "S"). Let us fix the typo.	2021-02-08 00:11:41 +01:00
Willy Tarreau	4acb99f867	BUG/MINOR: xxhash: make sure armv6 uses memcpy() There was a special case made to allow ARMv6 to use unaligned accesses via a cast in xxHash when __ARM_FEATURE_UNALIGNED is defined. But while ARMv6 (and v7) does support unaligned accesses, it's only for 32-bit pointers, not 64-bit ones, leading to bus errors when the compiler emits an ldrd instruction and the input (e.g. a pattern) is not aligned, as in issue #1035. Note that v7 was properly using the packed approach here and was safe, however haproxy versions 2.3 and older use the old r39 xxhash code which has the same issue for armv7. A slightly different fix is required there, by using a different definition of packed for 32 and 64 bits. The problem is really visible when running v7 code on a v8 kernel because such kernels do not implement alignment trap emulation, and the process dies when this happens. This is why in the issue above it was only detected under lxc. The emulation could have been disabled on v7 as well by writing zero to /proc/cpu/alignment though. This commit is a backport of xxhash commit a470f2ef ("update default memory access for armv6"). Thanks to @srkunze for the report and tests, @stgraber for his help on setting up an easy reproducer outside of lxc, and @Cyan4973 for the discussion around the best way to fix this. Details and alternate patches available on https://github.com/Cyan4973/xxHash/issues/490.	2021-02-04 17:14:58 +01:00
William Dauchy	4858fb2e18	MEDIUM: check: align agentaddr and agentport behaviour In the same manner as agentaddr, we now: - permit setting agentport through the `port` keyword, as is the case for agentaddr through `addr` - give priority to the `agent-port` keyword when used - add a flag to be able to test when the value is set, like for agentaddr. It makes the behaviour between `addr` and `port` more consistent. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 14:00:38 +01:00
William Dauchy	1c921cd748	BUG/MINOR: check: consistent way to set agentaddr Small consistency problem with the `addr` and `agent-addr` options: for both options, the last one parsed is always used to set the agent-check addr. Thus these two lines don't have the same behavior: server ... addr <addr1> agent-addr <addr2> server ... agent-addr <addr2> addr <addr1> After this patch `agent-addr` will always take priority over `addr`. It means we test the flag before setting agentaddr. We also fix all the places where we did not set the flag, to be coherent everywhere. I was not really able to determine where this issue is coming from, so we may probably backport it to all stable versions where the agent is supported. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 13:55:04 +01:00
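The priority rule described in this commit (test a flag before letting `addr` overwrite the agent address) can be modelled in a few lines. This is a sketch only: `resolve_agent_addr()` and the `(keyword, value)` list representation are hypothetical and do not mirror the actual server line parser.

```python
def resolve_agent_addr(keywords):
    """Return the agent-check address for a list of (keyword, value) pairs
    given in the order they appear on the server line."""
    agent_addr = None
    agent_addr_set = False  # flag: True once `agent-addr` was seen explicitly
    for name, value in keywords:
        if name == "agent-addr":
            agent_addr = value
            agent_addr_set = True
        elif name == "addr" and not agent_addr_set:
            # `addr` only acts as a fallback for the agent address.
            agent_addr = value
    return agent_addr
```

Both orderings from the commit message now resolve to `<addr2>`, and a line with only `addr` still falls back to `<addr1>`.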
William Dauchy	fe03e7d045	MEDIUM: server: adding support for check_port in server state We can currently change the check-port using the cli command `set server check-port` but there is a consistency issue when using server state. This patch aims to fix this problem but will be also a good preparation work to get rid of checkport flag, so we are able to know when checkport was set by config. I am fully aware this is not making github #953 moving forward, I however think this might be acceptable while waiting for a proper solution and resolve consistency problem faced with port settings. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:46:52 +01:00
William Dauchy	69f118d7b6	MEDIUM: check: remove checkport checkaddr flag While trying to fix some consistency problems with the config file/cli (e.g. the check-port cli command does not set the flag), we realised the checkport flag was not necessarily needed. Indeed tcpcheck uses the service port as the last choice if check.port is zero. So we can assume that if check.port is zero, it was never set by the user, regardless of whether it is through the cli or the config file. In the long term this will avoid introducing a new consistency issue if we forget to set the flag. In the same manner as the checkport flag, we don't really need the checkaddr flag. We can assume that if checkaddr is not set, it was never set by the user or config. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:43:00 +01:00
William Dauchy	eedb9b13f4	MINOR: stats: improve pending connections description In order to unify prometheus and stats description, we need to clarify the description for pending connections. - remove the BE reference in counters struct, as it is also used in servers - remove reference of `qcur` field in description as it is specific to stats implementation - try to reword cur and max pending connections description Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
Christopher Faulet	7aa3271439	MINOR: checks: Add function to get the result code corresponding to a status The function get_check_status_result() can now be used to get the result code (CHK_RES_) corresponding to a check status (HCHK_STATUS_). It will be used by the Prometheus exporter when reporting the check status of a server.	2021-02-01 15:16:33 +01:00
Willy Tarreau	d597ec2718	MINOR: listener: export manage_global_listener_queue() This one pops up in tasks lists when running against a saturated listener.	2021-01-29 14:29:57 +01:00
Willy Tarreau	02922e19ca	MINOR: session: export session_expire_embryonic() This is only to make it resolve nicely in "show tasks".	2021-01-29 12:27:57 +01:00
Willy Tarreau	fb5401f296	MINOR: listener: export accept_queue_process This is only to make it resolve in "show tasks".	2021-01-29 12:25:23 +01:00
Willy Tarreau	3fb6a7b46e	MINOR: activity: declare a new structure to collect per-function activity The new sched_activity structure will be used to collect task-level activity based on the target function. The principle is to declare a large enough array to make collisions rare (256 entries), and hash the function pointer using a reduced XXH to decide where to store the stats. On first computation an entry is definitely assigned to the array and it's done atomically. A special entry (0) is used to store collisions ("others"). The goal is to make it easy and inexpensive for the scheduler code to use these to store #calls, cpu_time and lat_time for each task.	2021-01-29 12:10:33 +01:00
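The lookup principle described in this commit (hash the function pointer into a 256-entry array, claim a free slot on first use, fall back to entry 0 on collision) might look like the following model. Everything here is an assumption made for illustration: plain integers stand in for function pointers, `fold_hash()` is a simple stand-in for the reduced XXH mentioned in the message, and the atomic slot assignment is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

TABLE_SIZE = 256


@dataclass
class SchedActivity:
    func: Optional[int] = None  # "function pointer" owning this slot
    calls: int = 0
    cpu_time: int = 0
    lat_time: int = 0


def fold_hash(addr):
    """Stand-in hash: fold the address down to 8 bits."""
    h = addr
    h ^= h >> 32
    h ^= h >> 16
    h ^= h >> 8
    return h & 0xFF


def activity_entry(table, func):
    """Return the slot for func, or slot 0 ("others") on collision."""
    entry = table[fold_hash(func)]
    if entry.func == func:
        return entry
    if entry.func is None:
        entry.func = func  # first use: the slot is assigned for good
        return entry
    return table[0]


def new_table():
    return [SchedActivity() for _ in range(TABLE_SIZE)]
```

The scheduler side would then only have to bump `calls`, `cpu_time` and `lat_time` on the returned entry, without knowing whether it got a dedicated slot or the shared one.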
Willy Tarreau	aa622b822b	MINOR: activity: make profiling more manageable In 2.0, commit `d2d3348ac` ("MINOR: activity: enable automatic profiling turn on/off") introduced an automatic mode to enable/disable profiling. The problem is that the automatic mode automatically changes to on/off, which implied that the forced on/off modes aren't sticky anymore. It's annoying when debugging because as soon as the load decreases, profiling stops. This makes a small change which ought to have been done first, which consists in having two states for "auto" (auto-on, auto-off) to distinguish them from the forced states. Setting to "auto" in the config defaults to "auto-off" as before, and setting it on the CLI switches to auto but keeps the current operating state. This is simple enough to be backported to older releases if needed.	2021-01-29 12:10:33 +01:00
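The four states described in this commit can be modelled as a tiny state machine. This is a sketch of the behaviour stated in the message, not the actual implementation: the `Prof` enum, the function names and the boolean `high_load` input (standing in for whatever condition drives the automatic switch) are hypothetical.

```python
from enum import Enum


class Prof(Enum):
    OFF = "off"            # forced off, sticky
    ON = "on"              # forced on, sticky
    AUTO_OFF = "auto-off"  # automatic mode, currently not profiling
    AUTO_ON = "auto-on"    # automatic mode, currently profiling


def from_config(value):
    """In the configuration, "auto" starts in the auto-off state."""
    return {"on": Prof.ON, "off": Prof.OFF, "auto": Prof.AUTO_OFF}[value]


def cli_set(current, value):
    """On the CLI, "auto" keeps the current operating state."""
    if value == "auto":
        running = current in (Prof.ON, Prof.AUTO_ON)
        return Prof.AUTO_ON if running else Prof.AUTO_OFF
    return from_config(value)


def on_load_change(current, high_load):
    """Only the auto states follow the load; forced states stay put."""
    if current in (Prof.AUTO_ON, Prof.AUTO_OFF):
        return Prof.AUTO_ON if high_load else Prof.AUTO_OFF
    return current
```

A forced `on` now survives a drop in load, which is the debugging annoyance the commit set out to remove.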
Willy Tarreau	4deeb1055f	MINOR: tools: add print_time_short() to print a condensed duration value When reporting some values in debugging output we often need to have some condensed, stable-length values. This function prints a duration from nanosecond to years with at least 4 digits of accuracy using the most suitable unit, always on 7 chars.	2021-01-29 12:10:33 +01:00
Christopher Faulet	405f054652	MINOR: h1: Raise the chunk size limit up to (2^52 - 1) The allowed chunk size was historically limited to 2GB to avoid risk of overflow. This restriction is no longer necessary because the chunk size is immediately stored into a 64-bit integer after the parsing. Thus, it is now possible to raise this limit. However, to never be fed possibly bogus values from languages that use floats for their integers, we don't accept more than 13 hex digits (2^52 - 1). 4 petabytes is probably enough! This patch should fix issue #1065. It may be backported as far as 2.1. For 2.0, the legacy HTTP part must be reviewed, but there is honestly no reason to do so.	2021-01-28 16:37:14 +01:00
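The 13-digit cap follows from 13 hex digits encoding exactly 52 bits. A minimal sketch of such a limit, assuming a bare hex string as input: `parse_chunk_size()` is a hypothetical helper, and details of the real H1 parser such as chunk extensions, whitespace or how leading zeros are counted are not modelled.

```python
import string

MAX_CHUNK_HEX_DIGITS = 13  # 13 hex digits = 52 bits, so max is 2^52 - 1


def parse_chunk_size(field):
    """Parse a hexadecimal chunk size, rejecting anything over 13 digits."""
    if not field or len(field) > MAX_CHUNK_HEX_DIGITS:
        raise ValueError("empty or oversized chunk size")
    if any(c not in string.hexdigits for c in field):
        raise ValueError("invalid character in chunk size")
    return int(field, 16)
```

The largest accepted value, `fffffffffffff`, is 2^52 - 1, which still fits exactly in the 53-bit mantissa of an IEEE 754 double.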
Amaury Denoyelle	f9dcbeeab3	MEDIUM: h2: send connect protocol h2 settings In order to announce support for the Extended CONNECT h2 method by haproxy, always send the ENABLE_CONNECT_PROTOCOL h2 settings. This new setting has been described in the rfc 8441. After receiving ENABLE_CONNECT_PROTOCOL, the client is free to use the Extended CONNECT h2 method. This can notably be useful for the support of websocket handshake on http/2.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	c9a0afcc32	MEDIUM: h2: parse Extended CONNECT request to htx Support for the rfc 8441 Bootstrapping WebSockets with HTTP/2 Convert an Extended CONNECT HTTP/2 request into a htx representation. The htx message uses the GET method with an Upgrade header field to be fully compatible with the equivalent HTTP/1.1 Upgrade mechanism. The Extended CONNECT is of the following form: :method = CONNECT :protocol = websocket :scheme = https :path = /chat :authority = server.example.com The new pseudo-header :protocol has been defined and is used to identify an Extended CONNECT method. Contrary to standard CONNECT, Extended CONNECT must have :scheme, :path and :authority defined.	2021-01-28 16:37:14 +01:00
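The conversion described in this commit can be illustrated on plain dictionaries. This is a sketch of the mapping only, not of the HTX API: `extended_connect_to_h1()` and the returned structure are hypothetical, the `connection: upgrade` header is added here on the assumption that an HTTP/1.1 upgrade needs it, and using `:path` alone as the request target is also an assumption.

```python
REQUIRED_PSEUDO_HEADERS = (":scheme", ":path", ":authority")


def extended_connect_to_h1(pseudo):
    """Map Extended CONNECT pseudo-headers to a GET + Upgrade request."""
    if pseudo.get(":method") != "CONNECT" or ":protocol" not in pseudo:
        raise ValueError("not an Extended CONNECT request")
    missing = [name for name in REQUIRED_PSEUDO_HEADERS if name not in pseudo]
    if missing:
        # Unlike a standard CONNECT, these three must all be present.
        raise ValueError("missing pseudo-headers: " + ", ".join(missing))
    return {
        "method": "GET",
        "target": pseudo[":path"],
        "headers": [
            ("host", pseudo[":authority"]),
            ("connection", "upgrade"),
            ("upgrade", pseudo[":protocol"]),
        ],
    }
```

Feeding it the example from the commit message yields a `GET /chat` request carrying `upgrade: websocket`, which is what an HTTP/1.1 backend expects to see.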
Amaury Denoyelle	aad333a9fc	MEDIUM: h1: add a WebSocket key on handshake if needed Add the header Sec-Websocket-Key when generating a h1 handshake websocket without this header. This is the case when doing h2-h1 conversion. The key is randomly generated and base64 encoded. It is stored on the session side to be able to verify response key and reject it if not valid.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	7416274914	MEDIUM: h2: parse Extended CONNECT response to htx Support for the rfc 8441 Bootstrapping WebSockets with HTTP/2 Convert a 200 status reply from an Extended CONNECT request into a htx representation. The htx message is set to 101 status code to be fully compatible with the equivalent HTTP/1.1 Upgrade mechanism. This conversion is only done if the stream flag H2_SF_EXT_CONNECT_SENT has been set. This is true if an Extended CONNECT request has already been seen on the stream. Besides the 101 status, the additional headers Connection/Upgrade are added to the htx message. The protocol is set from the value stored in h2s. Typically it will be extracted from the client request. This is only used if the client is using h1 as only the HTTP/1.1 101 Response contains the Upgrade header.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	c193823343	MEDIUM: h1: generate WebSocket key on response if needed Add the Sec-Websocket-Accept header on a websocket handshake response. This header may be missing if a h2 server is used with a h1 client. The response key is calculated following the rfc6455. For this, the handshake request key must be stored in the h1 session, in a new field named ws_key. Note that this is only done if the message has been previously identified as a Websocket handshake request.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	18ee5c3eb0	MINOR: h1: reject websocket handshake if missing key If a request is identified as a WebSocket handshake, it must contain a websocket key header or else it can be rejected, following the rfc6455. A new flag H1_MF_UPG_WEBSOCKET is set on such messages. For the request to be identified as a WebSocket handshake, it must contain the headers: Connection: upgrade Upgrade: websocket This commit is a companion of "MEDIUM: h1: generate WebSocket key on response if needed" and "MEDIUM: h1: add a WebSocket key on handshake if needed". Indeed, it ensures that a WebSocket key is added only from a http/2 side and not for a bogus http/1 peer.	2021-01-28 16:37:14 +01:00
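The response key calculation from RFC 6455 is short enough to show in full: append the fixed GUID defined by the RFC to the request key, hash with SHA-1, then base64-encode the digest. The `ws_accept()` function name is hypothetical; the GUID and the algorithm come from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def ws_accept(request_key):
    """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((request_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

With the sample key from RFC 6455, `dGhlIHNhbXBsZSBub25jZQ==`, this returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. This also shows why the request key has to be kept in the session: the response header cannot be computed without it.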
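The detection rule stated in this commit (a request is a WebSocket handshake when it carries `Connection: upgrade` and `Upgrade: websocket`, and such a request needs a key) can be sketched on a list of header pairs. The helper names are hypothetical and the comma-separated token handling is an assumption about how header values are matched.

```python
def header_tokens(headers, name):
    """Collect lowercase comma-separated tokens of all `name` headers."""
    tokens = []
    for header_name, value in headers:
        if header_name.lower() == name:
            tokens.extend(tok.strip().lower() for tok in value.split(","))
    return tokens


def is_websocket_handshake(headers):
    """True when both Connection: upgrade and Upgrade: websocket are present."""
    return ("upgrade" in header_tokens(headers, "connection")
            and "websocket" in header_tokens(headers, "upgrade"))


def accept_request(headers):
    """Reject a WebSocket handshake that lacks a Sec-WebSocket-Key header."""
    if not is_websocket_handshake(headers):
        return True  # not a handshake, nothing to enforce
    return any(name.lower() == "sec-websocket-key" for name, _ in headers)
```

Requests that are not handshakes pass through untouched; only a handshake without a key is refused.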
Christopher Faulet	7d247f0771	MINOR: h2/mux-h2: Add flags to notify the response is known to have no body The H2 message flag H2_MSGF_BODYLESS_RSP is now used during the request or the response parsing to notify the mux that, considering the parsed message, the response is known to have no body. This happens during HEAD requests parsing and during 204/304 responses parsing. On the H2 multiplexer, the equivalent flag is set on H2 streams. Thus the H2_SF_BODYLESS_RESP flag is set on a H2 stream if the H2_MSGF_BODYLESS_RSP is found after a HEADERS frame parsing. Conversely, this flag is also set when a HEADERS frame is emitted for HEAD requests and for 204/304 responses. The H2_SF_BODYLESS_RESP flag will be used to ignore data payload from the response but not the trailers.	2021-01-28 16:37:14 +01:00
Christopher Faulet	d1ac2b90cd	MAJOR: htx: Remove the EOM block type and use HTX_FL_EOM instead The EOM block may be removed. The HTX_FL_EOM flag is enough. Most of the time, to know if the end of the message is reached, we just need to have an empty HTX message with the HTX_FL_EOM flag set. It may also be detected when the last block of a message with the HTX_FL_EOM flag is manipulated. Removing EOM blocks simplifies the HTX message filling. Indeed, there are no more edge cases when the message ends but there is no more space to write the EOM block. However, some parts are more tricky, especially the compression filter and the FCGI mux. The compression filter must finish the compression on the last DATA block. Before, it was performed on the EOM block, and an extra DATA block with the checksum was added. Now, we must detect the last DATA block to be sure to finish the compression. The FCGI mux for its part must be sure to reserve the space for the empty STDIN record on the last DATA block, while this record was previously inserted on the EOM block. The H2 multiplexer is probably the part that benefits the most from this change. Indeed, it is now much easier to know when to set the ES flag. The HTX documentation has been updated accordingly.	2021-01-28 16:37:14 +01:00
Christopher Faulet	789a472674	MINOR: htx: Add a function to know if a block is the only one in a message The htx_is_unique_blk() function may now be used to know if a block is the only one in an HTX message, excluding all unused blocks. Note the purpose of this function is not to know if a block is the last one of an HTTP message. This means no more data part from the message are expected, except tunneled data. It only says if a block is alone in an HTX message.	2021-01-28 16:37:14 +01:00
Christopher Faulet	42432f347f	MINOR: htx: Rename HTX_FL_EOI flag into HTX_FL_EOM The HTX_FL_EOI flag is not well named. For now, it is not used much. But that will change: it will replace the EOM block. Thus, it is renamed.	2021-01-28 16:37:14 +01:00
Christopher Faulet	576c358508	MINOR: htx/http-ana: Save info about Upgrade option in the Connection header Add an HTX start-line flag and its counterpart into the HTTP message to track the presence of the Upgrade option into the Connection header. This way, without parsing the Connection header again, it will be easy to know if a client asks for a protocol upgrade and if the server agrees to do so. It will also be easy to perform some conformance checks when a 101-switching-protocols is received.	2021-01-28 16:27:48 +01:00
Christopher Faulet	4ef84c9c41	MINOR: stream: Add a function to validate TCP to H1 upgrades TCP to H1 upgrades are buggy for now. When such an upgrade is performed, a crash is experienced. The bug is the result of the recent H1 mux refactoring, and more specifically of the commit `c4bfa59f1` ("MAJOR: mux-h1: Create the client stream as later as possible"). Indeed, now the H1 mux is responsible for creating the frontend conn-stream once the request headers are fully received. Thus the TCP to H1 upgrade is a problem because the frontend conn-stream already exists. To fix the bug, we must keep this conn-stream and the associated stream and use them in the H1 mux. To do so, the upgrade will be performed in two steps. First, the mux is upgraded from mux-pt to mux-h1. Then, the mux-h1 performs the stream upgrade, once the request headers are fully received and parsed. To do so, stream_upgrade_from_cs() must be used. This function sets the SF_HTX flag to switch the stream to HTX mode, removes the SF_IGNORE flag and finally fills the request channel with some input data. This patch is required to fix the TCP to H1 upgrades and is intimately linked with the next commits.	2021-01-28 16:27:48 +01:00
Amaury Denoyelle	3f07c20fab	BUG/MEDIUM: session: only retrieve ready idle conn from session A bug was introduced by the early insertion of idle connections at the end of connect_server. It is possible to reuse a connection that is not yet ready and is waiting for a handshake (for example with proxy protocol or ssl). A wrong duplicate xprt_handshake_io_cb tasklet is thus registered as a side-effect. This triggers the BUG_ON statement of xprt_handshake_subscribe : BUG_ON(ctx->subs && ctx->subs != es); To counter this, a check is now present in session_get_conn to only return a connection without the flag CO_FL_WAIT_XPRT. This might sometimes cause the creation of dedicated server connections when in theory reuse could have been used, but it probably only occurs rarely in real conditions. This behavior is present since commit : MEDIUM: connection: Add private connections synchronously in session server list It could also be further exaggerated by : MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking It can be backported up to 2.3. NOTE : This bug seems to be only reproducible with mode tcp, for an unknown reason. However, reuse should never happen when not in http mode. This improper behavior will be the subject of a dedicated patch. This bug can easily be reproduced with the following config (a webserver is required to accept proxy protocol on port 31080) : global defaults mode tcp timeout connect 1s timeout server 1s timeout client 1s listen li bind 0.0.0.0:4444 server bla1 127.0.0.1:31080 check send-proxy-v2 with the inject client : $ inject -u 10000 -d 10 -G 127.0.0.1:4444 This should fix the github issue #1058.	2021-01-28 14:16:27 +01:00
Tim Duesterhus	491be54cf1	BUILD: Include stdlib.h in compiler.h if DEBUG_USE_ABORT is set Building with `"DEBUG=-DDEBUG_STRICT=1 -DDEBUG_USE_ABORT=1"` previously emitted the warning: In file included from include/haproxy/api.h:35:0, from src/mux_pt.c:13: include/haproxy/buf.h: In function ‘br_init’: include/haproxy/bug.h:42:90: warning: implicit declaration of function ‘abort’ [-Wimplicit-function-declaration] #define ABORT_NOW() do { extern void ha_backtrace_to_stderr(); ha_backtrace_to_stderr(); abort(); } while (0) ^ include/haproxy/bug.h:56:21: note: in expansion of macro ‘ABORT_NOW’ #define CRASH_NOW() ABORT_NOW() ^ include/haproxy/bug.h:68:4: note: in expansion of macro ‘CRASH_NOW’ CRASH_NOW(); \ ^ include/haproxy/bug.h:62:35: note: in expansion of macro ‘__BUG_ON’ #define _BUG_ON(cond, file, line) __BUG_ON(cond, file, line) ^ include/haproxy/bug.h:61:22: note: in expansion of macro ‘_BUG_ON’ #define BUG_ON(cond) _BUG_ON(cond, __FILE__, __LINE__) ^ include/haproxy/buf.h:875:2: note: in expansion of macro ‘BUG_ON’ BUG_ON(size < 2); ^ This patch fixes that issue. The `DEBUG_USE_ABORT` option exists for use with static analysis tools. No backport needed.	2021-01-27 12:44:39 +01:00
William Lallemand	795bd9ba3a	CLEANUP: ssl: remove SSL_CTX function parameter Since the server SSL_CTX is now stored in the ckch_inst, it is not needed anymore to pass an SSL_CTX to ckch_inst_new_load_srv_store() and ssl_sock_load_srv_ckchs().	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	bb470aa327	MINOR: ssl: Remove client_crt member of the server's ssl context The client_crt member is not used anymore since the server's ssl context initialization now behaves the same way as the bind lines one (using ckch stores and instances).	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	f3eedfe195	MEDIUM: ssl: Enable backend certificate hot update When trying to update a backend certificate, we should find a server-side ckch instance thanks to which we can rebuild a new ssl context and a new ckch instance that replace the previous ones in the server structure. This way any new ssl session will be built out of the new ssl context and the newly updated certificate. This resolves a subpart of GitHub issue #427 (the certificate part)	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	d817dc733e	MEDIUM: ssl: Load client certificates in a ckch for backend servers In order for the backend server's certificate to be hot-updatable, it needs to fit into the implementation used for the "bind" certificates. This patch follows the architecture implemented for the frontend implementation and reuses its structures and general function calls (adapted for the server side). The ckch store logic is kept and a dedicated ckch instance is used (one per server). The whole sni_ctx logic was not kept though because it is not needed. All the new functions added in this patch are basically server-side copies of functions that already exist on the frontend side with all the sni and bind_cond references removed. The ckch_inst structure has a new 'is_server_instance' flag which is used to distinguish regular instances from the server-side ones, and a new pointer to the server's structure in case of backend instance. Since the new server ckch instances are linked to a standard ckch_store, a lookup in the ckch store table will succeed so the cli code used to update bind certificates needs to be covered to manage those new server side ckch instances.	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	442b7f2238	MINOR: ssl: Server ssl context prepare function refactoring Split the server's ssl context initialization into the general ssl related initializations and the actual initialization of a single SSL_CTX structure. This way the context's initialization will be usable by itself from elsewhere.	2021-01-26 15:19:36 +01:00
Christopher Faulet	6071c2d12d	BUG/MEDIUM: filters/htx: Fix data forwarding when payload length is unknown It is only a problem on the response path because the request payload length is always known. But when a filter is registered to analyze the response payload, the filtering may hang if the server closes just after the headers. The root cause of the bug comes from an attempt to allow the filters to not immediately forward the headers if necessary. A filter may choose to hold the headers by not forwarding any bytes of the payload. For a message with no payload but a known payload length, there is always an EOM block to forward. Thus holding the EOM block for bodyless messages is a good way to also hold the headers. However, for messages with an unknown payload length, there is no EOM block finishing the message, but only a SHUTR flag on the channel to mark the end of the stream. If there is no payload when it happens, there is no payload at all to forward. In the filters API, it is wrongly detected as a condition to not forward the headers. Because it is not the most used feature and not the obvious one, this patch introduces another way to hold the message headers at the beginning of the forwarding. A filter flag is added to explicitly say the headers should be held. A filter may choose to set the STRM_FLT_FL_HOLD_HTTP_HDRS flag and not forward anything to hold the headers. This flag is removed at each call, thus it must always be explicitly set by filters. This flag is only evaluated if no byte has ever been forwarded because the headers are forwarded with the first byte of the payload. The reg-tests/filters/random-forwarding.vtc reg-test is updated to also test responses with unknown payload length (with and without payload). This patch must be backported as far as 2.0.	2021-01-26 09:53:52 +01:00
Tim Duesterhus	3d7f9ff377	MINOR: abort() on my_unreachable() when DEBUG_USE_ABORT is set. Hopefully this helps static analysis tools detecting that the code after that call is unreachable. See GitHub Issue #1075.	2021-01-26 09:33:18 +01:00
William Dauchy	d3a9a4992b	MEDIUM: stats: allow to select one field in `stats_fill_sv_stats` prometheus approach requires to output all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. This patch follows what has already been done on frontend and backend side. From this patch it should be possible to remove most of the duplicate code on prometheus side for the server. A few things to note though: - state requires prior calculation, so I moved that to a sort of helper `stats_fill_be_stats_computestate`. - all ST_F*TIME fields require some minor compute, so I moved it at the beginning of the function under a condition. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-26 09:24:51 +01:00
William Dauchy	da3b466fc2	MEDIUM: stats: allow to select one field in `stats_fill_be_stats` prometheus approach requires to output all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. This patch follows what has already been done on frontend side. From this patch it should be possible to remove most of the duplicate code on prometheus side for the backend. A few things to note though: - status and uweight fields require prior compute, so I moved that to a sort of helper `stats_fill_be_stats_computesrv`. - all ST_F*TIME fields require some minor compute, so I moved it at the beginning of the function under a condition. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-26 09:24:19 +01:00
William Dauchy	2107a0faf5	CLEANUP: stats: improve field selection for frontend http fields while working on backend/servers I realised I could have written that in a better way and avoid one extra break. This is slightly improving readiness. also while being here, fix function declaration which was not 100% accurate. this patch does not change the behaviour of the code. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-25 15:53:28 +01:00
Ilya Shipitsin	1fc44d494a	BUILD: ssl: guard Client Hello callbacks with HAVE_SSL_CLIENT_HELLO_CB macro instead of openssl version let us introduce new macro HAVE_SSL_CLIENT_HELLO_CB and guard callback functions with it	2021-01-22 20:45:24 +01:00
Willy Tarreau	2bfce7e424	MINOR: debug: let ha_dump_backtrace() dump a bit further for some callers The dump state is now passed to the function so that the caller can adjust the behavior. A new series of 4 values allow to stop after dumping main instead of before it or any of the usual loops. This allows to also report BUG_ON() that could happen very high in the call graph (e.g. startup, or the scheduler itself) while still understanding what the call path was.	2021-01-22 14:48:34 +01:00
Willy Tarreau	5baf4fe31a	MEDIUM: debug: now always print a backtrace on CRASH_NOW() and friends The purpose is to enable the dumping of a backtrace on BUG_ON(). While it's very useful to know that a condition was met, very often some caller context is missing to figure how the condition could happen. From now on, on systems featuring backtrace, a backtrace of the calling thread will also be dumped to stderr in addition to the unexpected condition. This will help users of DEBUG_STRICT as they'll most often find this backtrace in their logs even if they can't find their core file. A new "debug dev bug" expert-mode CLI command was added to test the feature.	2021-01-22 14:18:34 +01:00
Willy Tarreau	a8459b28c3	MINOR: debug: create ha_backtrace_to_stderr() to dump an instant backtrace This function calls the ha_dump_backtrace() function with a locally allocated buffer and sends the output slightly indented to fd #2. It's meant to be used as an emergency backtrace dump.	2021-01-22 14:15:36 +01:00
Willy Tarreau	123fc9786a	MINOR: debug: extract the backtrace dumping code to its own function The backtrace dumping code was located into the thread dump function but it looks particularly convenient to be able to call it to produce a dump in other situations, so let's move it to its own function and make sure it's called last in the function so that we can benefit from tail merging to save one entry.	2021-01-22 13:52:41 +01:00
Willy Tarreau	2f1227eb3f	MINOR: debug: always export the my_backtrace function In order to simplify the code and remove annoying ifdefs everywhere, let's always export my_backtrace() and make it adapt to the situation and return zero if not supported. A small update in the thread dump function was needed to make sure we don't use its results if it fails now.	2021-01-22 12:12:29 +01:00
William Dauchy	b9577450ea	MINOR: contrib/prometheus-exporter: use fill_fe_stats for frontend dump use `stats_fill_fe_stats` when possible to avoid duplicating code; make use of field selector to get the needed field only. this should not introduce any difference of output. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
William Dauchy	0ef54397b0	MEDIUM: stats: allow to select one field in `stats_fill_fe_stats` prometheus approach requires to output all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. From this patch it should be possible to remove most of the duplicate code on prometheus side for the frontend. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
William Dauchy	defd15685e	MINOR: stats: add new start time field Another patch in order to try to reconcile haproxy stats and prometheus. Here I'm adding a proper start time field in order to make proper use of uptime field. That being done we can move the calculation in `fill_info` Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
William Dauchy	a8766cfad1	MINOR: stats: duplicate 3 fields in bytes in info in order to prepare a possible merge of fields between haproxy stats and prometheus, duplicate 3 fields: INF_MEMMAX INF_POOL_ALLOC INF_POOL_USED Those were specifically named in MB unit which is not what prometheus recommends. We therefore used them but changed the unit while doing the calculation. It created a specific case for that, up to the description. This patch: - removes some possible confusion, i.e. using MB field for bytes - will permit an easier merge of fields such as description First consequence for now, is that we can remove the calculation on prometheus side and move it on `fill_info`. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
Christopher Faulet	142dd33912	MINOR: muxes: Add exit status for errors about not implemented features The MUX_ES_NOTIMPL_ERR exit status is added to allow the multiplexers to report errors about not implemented features. This will be used by the H1 mux to return 501-not-implemented errors.	2021-01-21 15:21:12 +01:00
Christopher Faulet	e095f31d36	MINOR: http: Add HTTP 501-not-implemented error message Add the support for the 501-not-implemented status code with the corresponding default message. The documentation is updated accordingly because it is now part of status codes HAProxy may emit via an errorfile or a deny/return HTTP action.	2021-01-21 15:21:12 +01:00
Christopher Faulet	8f100427c4	BUG/MEDIUM: tcpcheck: Don't destroy connection in the wake callback context When a tcpcheck ruleset uses multiple connections, the existing one must be closed and destroyed before opening the new one. This part is handled in the tcpcheck_main() function, when called from the wake callback function (wake_srv_chk). But it is indeed a problem, because this function may be called from the mux layer. This means a mux may call the wake callback function of the data layer, which may release the connection and the mux. It is easy to see how it is hazardous. And actually, depending on the scheduling, it leads to crashes. Thus, we must avoid releasing the connection in the wake callback context, and move this part in the check's process function instead. To do so, we rely on the CHK_ST_CLOSE_CONN flag. When a connection must be replaced by a new one, this flag is set on the check, in tcpcheck_main() function, and the check's task is woken up. Then, the connection is really closed in process_chk_conn() function. This patch must be backported as far as 2.2, with some adaptations however because the code is not exactly the same.	2021-01-21 15:21:12 +01:00
Willy Tarreau	8050efeacb	MINOR: cli: give the show_fd helpers the ability to report a suspicious entry Now the show_fd helpers at the transport and mux levels return an integer which indicates whether or not the inspected entry looks suspicious. When an entry is reported as suspicious, "show fd" will suffix it with an exclamation mark ('!') in the dump, that is supposed to help detecting them. For now, helpers were adjusted to adapt to the new API but none of them reports any suspicious entry yet.	2021-01-21 08:58:15 +01:00
Willy Tarreau	108a271049	MINOR: xprt: add a new show_fd() helper to complete some "show fd" dumps. Just like we did for the muxes, now the transport layers will have the ability to provide helpers to report more detailed information about their internal context. When the helper is not known, the pointer continues to be dumped as-is if it's not NULL. This way a transport with no context nor dump function will not add a useless "xprt_ctx=(nil)" but the pointer will be emitted if valid or if a helper is defined.	2021-01-20 17:17:39 +01:00
Willy Tarreau	45fd1030d5	CLEANUP: tools: make resolve_sym_name() take a const pointer When `0c439d895` ("BUILD: tools: make resolve_sym_name() return a const") was written, the pointer argument ought to have been turned to const for more flexibility. Let's do it now.	2021-01-20 17:17:39 +01:00
Tim Duesterhus	1d66e396bf	MINOR: cache: Remove the `hash` part of the accept-encoding secondary key As of commit `6ca89162dc` this hash no longer is required, because unknown encodings are no longer stored and known encodings do not use the cache.	2021-01-18 15:01:41 +01:00
Willy Tarreau	31ffe9fad0	MINOR: pattern: add the missing generation ID manipulation functions The functions needed to commit a pattern file generation number or increase it were still missing. Better not have the caller play with these.	2021-01-15 14:41:16 +01:00
Willy Tarreau	dc2410d093	CLEANUP: pattern: rename pat_ref_commit() to pat_ref_commit_elt() It's about the third time I get confused by these functions, half of which manipulate the reference as a whole and those manipulating only an entry. For me "pat_ref_commit" means committing the pattern reference, not just an element, so let's rename it. A number of other ones should really be renamed before 2.4 gets released :-/	2021-01-15 14:11:59 +01:00
Christopher Faulet	d4a83dd6b3	MINOR: config: Add failifnotcap() to emit an alert on proxy capabilities This function must be used to emit an alert if a proxy does not have at least one of the requested capabilities. An additional message may be appended to the alert.	2021-01-13 17:45:34 +01:00
William Dauchy	5d9b8f3c93	MINOR: contrib/prometheus-exporter: use fill_info for process dump use `stats_fill_info` when possible to avoid duplicating code. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-13 15:19:00 +01:00
Thayne McCombs	8f0cc5c4ba	CLEANUP: Fix spelling errors in comments This is from the output of codespell. It's done at once over a bunch of files and only affects comments, so there is nothing user-visible. No backport needed.	2021-01-08 14:56:32 +01:00
William Dauchy	5a982a7165	MINOR: contrib/prometheus-exporter: export build_info commit `c55a626217` ("MINOR: contrib/prometheus-exporter: Add missing global and per-server metrics") is renaming two metrics between v2.2 and v2.3: server_idle_connections_current server_idle_connections_limit It is breaking some tools which are making use of those metrics while supporting several haproxy versions. This build_info will permit tools which make use of metrics to be able to match the haproxy version and change the list of expected metrics. This was possible using the haproxy stats socket but not with prometheus export. This patch follows prometheus best practices to export specific software information. It is adding a new field `build_info` so we can extend it to other parameters if needed in the future. example output: # HELP haproxy_process_build_info HAProxy build info. # TYPE haproxy_process_build_info gauge haproxy_process_build_info{version="2.4-dev5-2e1a3f-5"} 1 Even though it is not a bugfix, this patch will make more sense when backported up to >= 2.0 Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-08 14:48:13 +01:00
Ilya Shipitsin	b8888ab557	CLEANUP: assorted typo fixes in the code and comments This is 15th iteration of typo fixes	2021-01-06 17:32:03 +01:00
Ilya Shipitsin	1e9a66603f	CLEANUP: assorted typo fixes in the code and comments This is 14th iteration of typo fixes	2021-01-06 16:26:50 +01:00
Frédéric Lécaille	242fb1b639	MINOR: quic: Drop packets with STREAM frames with wrong direction. A server initiates streams with odd-numbered stream IDs. Also add useful traces when parsing STREAM frames.	2021-01-04 12:31:28 +01:00
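The stream ID rule referred to in commit 242fb1b639 can be sketched as follows. In QUIC (RFC 9000, section 2.1) the least significant bit of a stream ID identifies the initiator (0 for the client, 1 for the server, hence the odd-numbered server streams) and the next bit distinguishes bidirectional (0) from unidirectional (1) streams. The helper names below are hypothetical and are not HAProxy's; the check shown is only one plausible reading of "wrong direction", assuming the receiver is a server.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not HAProxy's actual API.
 * Bit layout as defined by RFC 9000, section 2.1.
 */
#define SKETCH_STREAM_INITIATOR_BIT 0x01 /* 0 = client-initiated, 1 = server-initiated */
#define SKETCH_STREAM_DIR_BIT       0x02 /* 0 = bidirectional, 1 = unidirectional */

static inline bool sketch_stream_is_server_initiated(uint64_t id)
{
	return (id & SKETCH_STREAM_INITIATOR_BIT) != 0;
}

static inline bool sketch_stream_is_uni(uint64_t id)
{
	return (id & SKETCH_STREAM_DIR_BIT) != 0;
}

/* Returns true if a server may legitimately receive a STREAM frame on <id>.
 * The only forbidden case is a unidirectional stream that the server opened
 * itself, since such a stream is send-only from the server's point of view.
 */
static inline bool sketch_server_may_receive_on(uint64_t id)
{
	return !(sketch_stream_is_uni(id) && sketch_stream_is_server_initiated(id));
}
```

With these definitions, stream IDs 0, 1 and 2 are acceptable on the server's receive path while ID 3 (server-initiated, unidirectional) is not.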
Frédéric Lécaille	6c1e36ce55	CLEANUP: quic: Remove useless QUIC event trace definitions. Remove QUIC_EV_CONN_E* event trace macros which were defined for errors. Replace QUIC_EV_CONN_ECHPKT by QUIC_EV_CONN_BCFRMS used in qc_build_cfrms()	2021-01-04 12:31:28 +01:00
Frédéric Lécaille	164096eb76	MINOR: qpack: Add static header table definitions for QPACK. As HPACK, QPACK makes usage of a static header table.	2021-01-04 12:31:28 +01:00
Tim Duesterhus	54182ec9d7	CLEANUP: Apply the coccinelle patch for `XXXcmp()` on include/ Compare the various `XXXcmp()` functions against zero.	2021-01-04 10:09:02 +01:00
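As an illustration of the style described in commit 54182ec9d7, here is a minimal sketch. The exact rewrite performed by the coccinelle patch is not shown in this log; the sketch assumes it turns negated calls such as `!strcmp(a, b)` into explicit comparisons against zero. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example functions, not taken from HAProxy. */

static inline bool sketch_is_get_method(const char *method)
{
	/* Explicit comparison against zero, instead of the terser
	 * but easier to misread "!strcmp(method, "GET")".
	 */
	return strcmp(method, "GET") == 0;
}

static inline bool sketch_same_prefix(const void *a, const void *b, size_t len)
{
	/* Same convention for the other members of the XXXcmp() family. */
	return memcmp(a, b, len) == 0;
}
```

Both forms compile to the same code; the explicit `== 0` only makes it visible that equality is being tested.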
Thayne McCombs	92149f9a82	MEDIUM: stick-tables: Add srvkey option to stick-table This allows using the address of the server rather than the name of the server for keeping track of servers in a backend for stickiness. The peers code was also extended to support feeding the dictionary using this key instead of the name. Fixes #814	2020-12-31 10:04:54 +01:00
Remi Tricot-Le Breton	ce9e7b2521	MEDIUM: cache: Manage a subset of encodings in accept-encoding normalizer The accept-encoding normalizer now explicitly manages a subset of encodings which will all have their own bit in the encoding bitmap stored in the cache entry. This way two requests with the same primary key will be served the same cache entry if they both explicitly accept the stored response's encoding, even if their respective secondary keys are not the same and do not match the stored response's one. The actual hash of the accept-encoding will still be used if the response's encoding is unmanaged. The encoding matching and the encoding weight parsing are done for every subpart of the accept-encoding values, and a bitmap of accepted encodings is built for every request. It is then tested upon any stored response that has the same primary key until one with an accepted encoding is found. The specific "identity" and "*" accept-encoding values are managed too. When storing a response in the key, we also parse the content-encoding header in order to only set the response's corresponding encoding's bit in its cache_entry encoding bitmap. This patch fixes GitHub issue #988. It does not need to be backported.	2020-12-24 17:18:00 +01:00
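The bitmap matching described in commit ce9e7b2521 can be sketched roughly as below. The bit assignments and names are hypothetical and chosen for illustration only; the real normalizer also handles weights, "identity" and "*", which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments: one bit per managed encoding. */
enum sketch_encoding {
	SKETCH_ENC_IDENTITY = 1u << 0,
	SKETCH_ENC_GZIP     = 1u << 1,
	SKETCH_ENC_DEFLATE  = 1u << 2,
	SKETCH_ENC_BR       = 1u << 3,
};

/* <accepted> is the bitmap built from the request's accept-encoding values,
 * <stored> is the single bit recorded for the cached response's
 * content-encoding. The entry can be served if that bit is accepted.
 */
static inline bool sketch_entry_acceptable(uint32_t accepted, uint32_t stored)
{
	return (accepted & stored) != 0;
}
```

A request accepting gzip and br would thus match a gzip-encoded entry even when its accept-encoding header differs textually from the one that created the entry.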
Remi Tricot-Le Breton	56e46cb393	MINOR: http: Add helper functions to trim spaces and tabs Add two helper functions that trim leading or trailing spaces and horizontal tabs from an ist string.	2020-12-24 17:18:00 +01:00
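A minimal sketch of what such trimming helpers might look like, using a stand-in pointer-plus-length string type. The type and function names are hypothetical and do not reproduce HAProxy's ist API.

```c
#include <stddef.h>

/* Stand-in for an indirect string (pointer + length); hypothetical type. */
struct sketch_ist {
	const char *ptr;
	size_t len;
};

static inline int sketch_is_sp_or_htab(char c)
{
	return c == ' ' || c == '\t';
}

/* Skips leading spaces and horizontal tabs without copying the data. */
static inline struct sketch_ist sketch_trim_leading(struct sketch_ist s)
{
	while (s.len && sketch_is_sp_or_htab(s.ptr[0])) {
		s.ptr++;
		s.len--;
	}
	return s;
}

/* Drops trailing spaces and horizontal tabs by shortening the length. */
static inline struct sketch_ist sketch_trim_trailing(struct sketch_ist s)
{
	while (s.len && sketch_is_sp_or_htab(s.ptr[s.len - 1]))
		s.len--;
	return s;
}
```

Because the string is passed and returned by value, the caller's original view of the buffer is left untouched.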
Remi Tricot-Le Breton	2b5c5cbef6	MINOR: cache: Avoid storing responses whose secondary key was not correctly calculated If any of the secondary hash normalizing functions raises an error, the secondary hash will be unusable. In this case, the response will not be stored anymore.	2020-12-24 17:18:00 +01:00
Frédéric Lécaille	f63921fc24	MINOR: quic: Add traces for quic_packet_encrypt(). Add traces to have an idea why this function may fail. In fact it never fails when the passed parameters are correct, especially the lengths. This is not the case when a packet is not correctly built before being encrypted.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	133e8a7146	MINOR: quic: make a packet build fail when qc_build_frm() fails. Even if the sizes of frames built by qc_build_frm() are computed so as not to overflow a buffer, do not rely on this and always make a packet build fail if we could not build a frame. Also add traces to have an idea where qc_build_frm() fails. Fixes a memory leak in qc_build_phdshk_apkt().	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	f7e0b8d6ae	MINOR: quic: Add traces for in flight ack-eliciting packet counter. Add trace for this counter. Also shorten its variable name (->ifae_pkts).	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	04ffb66bc9	MINOR: quic: Make usage of the congestion control window. Remove ->ifcdata which was there to control the CRYPTO data sent to the peer so as not to saturate its reception buffer. This was a sort of flow control. Add ->prep_in_flight counter to the QUIC path struct to control the number of bytes prepared to be sent so as not to saturate the congestion control window. This counter is increased each time a packet was built. This has nothing to do with ->in_flight which is the real in flight number of bytes which have really been sent. We are obliged to maintain two such counters to know how many bytes of data we can prepare before sending them. Modify traces consequently which were useful to diagnose issues about the congestion control window usage.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	c5e72b9868	MINOR: quic: Attempt to make trace more readable As there is a lot of information in this protocol, this is not easy to make the traces readable. We remove here a few of them and shorten some lines by shortening the variable names.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	8090b51e92	MAJOR: quic: Make usage of ebtrees to store QUIC ACK ranges. Store QUIC ACK ranges in ebtrees in place of lists with a O(n) time complexity for insertion.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	f46c10cfb1	MINOR: server: Add QUIC definitions to servers. This patch adds QUIC structs to server struct so as to make the QUIC code compile. Also initializes the ebtree to store the connections by connection IDs.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	884f2e9f43	MINOR: listener: Add QUIC info to listeners and receivers. This patch adds a quic_transport_params struct to bind_conf struct used for the listeners. This is to store the QUIC transport parameters for the listeners. Also initializes them when calling str2listener(). Before str2sa_range() it's too early to figure we're going to speak QUIC, and after it's too late as listeners are already created. So it seems that doing it in str2listener() when the protocol is discovered is the best place. Also adds two ebtrees to the underlying receivers to store the connection by connections IDs (one for the original connection IDs, and another one for the definitive connection IDs which really identify the connections). However it doesn't seem normal that it is stored in the receiver nor the listener. There should be a private context in the listener so that protocols can store internal information. This element should in fact be the listener handle. Something still feels wrong, and probably we'll have to make QUIC and SSL co-exist: a proof of this is that there's some explicit code in bind_parse_ssl() to prevent the "ssl" keyword from replacing the xprt.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	0c4e3b09b0	MINOR: quic: Add definitions for QUIC protocol. This patch imports all the definitions for QUIC protocol with few modifications from 20200720-quic branch of quic-dev repository found at https://github.com/haproxytech/quic-dev.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	901ee2f37b	MINOR: ssl: Export definitions required by QUIC. QUIC needs to initialize its BIO and SSL session the same way as for SSL over TCP connections. It needs also to use the same ClientHello callback. This patch only exports functions and variables shared between QUIC and SSL/TCP connections.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	5e3d83a221	MINOR: connection: Add a new xprt to connection. Simply adds XPRT_QUIC new enum to integrate QUIC transport protocol.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	70da889d57	MINOR: quic: Redefine control layer callbacks which are QUIC specific. We add src/quic_sock.c QUIC specific socket management functions as callbacks for the control layer: ->accept_conn, ->default_iocb and ->rx_listening. accept_conn() will have to be defined. The default I/O handler only recvfrom() the datagrams received. Furthermore, ->rx_listening callback always returns 1 at this time but should return 0 when reloading the process.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	72f7cb170a	MINOR: connection: Attach a "quic_conn" struct to "connection" struct. This is a simple patch to prepare the integration of QUIC support to come. quic_conn struct is supposed to embed any QUIC specific information for a QUIC connection.	2020-12-23 11:57:26 +01:00
Frédéric Lécaille	ca42b2c9d3	MINOR: protocol: Create proto_quic QUIC protocol layer. As QUIC is a connection oriented protocol, this file is almost a copy of proto_tcp without TCP specific features. To suspend/resume a QUIC receiver we proceed the same way as for proto_udp receivers. With the recent updates to the listeners, we don't need a specific set of quic*_add_listener() functions, the default ones are sufficient. The fields declaration were reordered to make the various layers more visible like in other protocols. udp_suspend_receiver/udp_resume_receiver are up-to-date (the check for INHERITED is present) and the code being UDP-specific, it's normal to use UDP here. Note that in the future we might more easily reference stacked layers so that there's no more need for specifying the pointer here.	2020-12-23 11:57:26 +01:00
Dragan Dosen	6f7cc11e6d	MEDIUM: xxhash: use the XXH_INLINE_ALL macro to inline all functions This way we make all xxhash functions inline, with implementations being directly included within xxhash.h. Makefile is updated as well, since we don't need to compile and link xxhash.o anymore. Inlining should improve performance on small data inputs.	2020-12-23 06:39:21 +01:00
Dragan Dosen	967e7e79af	MEDIUM: xxhash: use the XXH3 functions to generate 64-bit hashes Replace the XXH64() function calls with the XXH3 variant function XXH3_64bits_withSeed() where possible.	2020-12-23 06:39:21 +01:00
Dragan Dosen	de37443e64	IMPORT: xxhash: update to v0.8.0 that introduces stable XXH3 variant A new XXH3 variant of hash functions shows a noticeable improvement in performance (especially on small data), and also brings 128-bit support, better inlining and streaming capabilities. Performance comparison is available here: https://github.com/Cyan4973/xxHash/wiki/Performance-comparison	2020-12-23 06:39:21 +01:00
Olivier Houchard	63ee281854	MINOR: atomic: don't use ; to separate instruction on aarch64. The assembler on MacOS aarch64 interprets ; as the beginning of comments, so it is not suitable for separating instructions in inline asm. Use \n instead. This should be backported to 2.3, 2.2, 2.1, 2.0 and 1.9.	2020-12-23 01:23:41 +01:00
Willy Tarreau	4f59d38616	MINOR: time: increase the minimum wakeup interval to 60s The MAX_DELAY_MS which is set an upper limit to the poll wait time and force a wakeup this often used to be set to 1 second in order to easily spot and correct time drifts. This was added 12 years ago at an era where virtual machines were starting to become common in server environments while not working particularly well. Nowadays, such issues are not as common anymore, however forcing 64 threads to wake up every single second starts to make the process visible on otherwise idle systems. Let's increase this wakeup interval to one minute. In the worst case it will make idle threads wake every second, which remains low. If this is not sufficient anymore on some systems, another approach would consist in implementing a deep-sleep mode which only triggers after a while and which is always disabled if any time drift is observed.	2020-12-22 10:35:43 +01:00
Christian Ruppert	b67e155895	BUILD: hpack: hpack-tbl-t.h uses VAR_ARRAY but does not include compiler.h This fixes building hpack from contrib, which failed because of the undeclared VAR_ARRAY: make -C contrib/hpack ... cc -O2 -Wall -g -I../../include -fwrapv -fno-strict-aliasing -c -o gen-enc.o gen-enc.c In file included from gen-enc.c:18: ../../include/haproxy/hpack-tbl-t.h:105:23: error: 'VAR_ARRAY' undeclared here (not in a function) 105 \| struct hpack_dte dte[VAR_ARRAY]; /* dynamic table entries */ ... As discussed in the thread below, let's redefine VAR_ARRAY in this file so that it remains self-sustaining: https://www.mail-archive.com/haproxy@formilux.org/msg39212.html	2020-12-22 10:18:07 +01:00
Ilya Shipitsin	f38a01884a	CLEANUP: assorted typo fixes in the code and comments This is 13th iteration of typo fixes	2020-12-21 11:24:48 +01:00
Ilya Shipitsin	af204881a3	BUILD: ssl: fine guard for SSL_CTX_get0_privatekey call SSL_CTX_get0_privatekey is openssl/boringssl specific function present since openssl-1.0.2, let us define readable guard for it, not depending on HA_OPENSSL_VERSION	2020-12-21 11:17:36 +01:00
Willy Tarreau	b1f54925fc	BUILD: plock: remove dead code that causes a warning in gcc 11 As Ilya reported in issue #998, gcc 11 complains about misleading code indentation which is in fact caused by dead assignments to zero after a loop which stops on zero. Let's clean both of these.	2020-12-21 10:27:18 +01:00
Miroslav Zagorac	7f8314c8d1	MINOR: opentracing: add ARGC_OT enum Due to the addition of the OpenTracing filter it is necessary to define ARGC_OT enum. This value is used in the functions fmt_directive() and smp_resolve_args().	2020-12-16 15:49:53 +01:00
Miroslav Zagorac	6deab79d59	MINOR: vars: replace static functions with global ones The OpenTracing filter uses several internal HAProxy functions to work with variables and therefore requires two static local HAProxy functions, var_accounting_diff() and var_clear(), to be declared global. In fact, the var_clear() function was not originally defined as static, but it lacked a declaration.	2020-12-16 14:20:08 +01:00
Ilya Shipitsin	ec60909871	BUILD: SSL: fine guard for SSL_CTX_add_server_custom_ext call SSL_CTX_add_server_custom_ext is openssl specific function present since openssl-1.0.2, let us define readable guard for it, not depending on HA_OPENSSL_VERSION	2020-12-15 16:13:35 +01:00
Willy Tarreau	472125bc04	MINOR: protocol: add a pair of check_events/ignore_events functions at the ctrl layer Right now the connection subscribe/unsubscribe code needs to manipulate FDs, which is not compatible with QUIC. In practice what we need there is to be able to either subscribe or wake up depending on readiness at the moment of subscription. This commit introduces two new functions at the control layer, which are provided by the socket code, to check for FD readiness or subscribe to it at the control layer. For now it's not used.	2020-12-11 17:02:50 +01:00
Willy Tarreau	2ded48dd27	MINOR: connection: make conn_sock_drain() use the control layer's ->drain() Now we don't touch the fd anymore there, instead we rely on the ->drain() provided by the control layer. As such the function was renamed to conn_ctrl_drain().	2020-12-11 16:26:01 +01:00
Willy Tarreau	427c846cc9	MINOR: protocol: add a ->drain() function at the connection control layer This is what we need to drain pending incoming data from a connection. The code was taken from conn_sock_drain() without the connection-specific stuff. It still takes a connection for now for API simplicity.	2020-12-11 16:26:00 +01:00
Willy Tarreau	586f71b43f	REORG: connection: move the socket iocb (conn_fd_handler) to sock.c conn_fd_handler() is 100% specific to socket code. It's about time it moves to sock.c which manipulates socket FDs. With it comes conn_fd_check() which tests for the socket's readiness. The ugly connection status check at the end of the iocb was moved to an inlined function in connection.h so that if we need it for other socket layers it's not too hard to reuse. The code was really only moved and not changed at all.	2020-12-11 16:26:00 +01:00
Willy Tarreau	827fee7406	MINOR: connection: remove sock-specific code from conn_sock_send() The send() loop present in this function and the error handling is already present in raw_sock_from_buf(). Let's rely on it instead and stop touching the FD from this place. The send flag was changed to use a more agnostic CO_SFL_*. The name was changed to "conn_ctrl_send()" to remind that it's meant to be used to send at the lowest level.	2020-12-11 16:25:11 +01:00
Willy Tarreau	3a9e56478e	CLEANUP: connection: remove the unneeded fd_stop_{recv,send} on read0/shutw These are two other areas where this fd_stop_recv()/fd_stop_send() makes no sense anymore. Both happen by definition while the FD is not subscribed, since nowadays it's subscribed after failing recv()/send(), in which case we cannot close.	2020-12-11 13:56:12 +01:00
Willy Tarreau	3ec094b09d	CLEANUP: remove the unused fd_stop_send() in conn_xprt_shutw{,_hard}() These functions used to disable polling for writes when shutting down but this is no longer used as it still happens later when closing if the connection was subscribed to FD events. Let's just remove this fake and undesired dependency on the FD layer.	2020-12-11 13:49:19 +01:00
Amaury Denoyelle	8d22823ade	MEDIUM: http_act: define set-timeout server/tunnel action Add a new http-request action 'set-timeout [server/tunnel]'. This action can be used to update the server or tunnel timeout of a stream. It takes two parameters, the timeout name to update and the new timeout value. This rule is only valid for a proxy with backend capabilities. The timeout value cannot be null. A sample expression can also be used instead of a plain value.	2020-12-11 12:01:07 +01:00
Amaury Denoyelle	fb50443517	MEDIUM: stream: support a dynamic tunnel timeout Allow the modification of the tunnel timeout on the stream side. Use a new field in the stream for the tunnel timeout. It is initialized by the tunnel timeout from backend unless it has already been set by a set-timeout tunnel rule.	2020-12-11 12:01:07 +01:00
Amaury Denoyelle	b715078821	MINOR: stream: prepare the hot refresh of timeouts Define a stream function to allow to update the timeouts. This commit is in preparation for the support of dynamic timeouts with the set-timeout rule.	2020-12-11 12:01:07 +01:00
Amaury Denoyelle	5a9fc2d10f	MINOR: action: define enum for timeout type of the set-timeout rule This enum is used to specify the timeout targeted by a set-timeout rule.	2020-12-11 12:01:07 +01:00
Willy Tarreau	343d0356a5	CLEANUP: connection: remove the unused conn_{stop,cond_update}_polling() These functions are not used anymore and were quite confusing given that their names reflected their original role and not the current ones. Let's kill them before they inspire anyone.	2020-12-11 11:21:53 +01:00
Willy Tarreau	6aee5b9a4c	MINOR: connection: implement cs_drain_and_close() We had cs_close() which forces a CS_SHR_RESET mode on the read side, and due to this there are a few call places in the checks which perform a manual call to conn_sock_drain() before calling cs_close(). This is absurd by principle, and it can be counter-productive in the case of a mux where this could even cause the opposite of the desired effect by deleting pending frames on the socket before closing. Let's add cs_drain_and_close() which uses the CS_SHR_DRAIN mode to prepare this.	2020-12-11 11:04:51 +01:00
Willy Tarreau	29885f0308	MINOR: udp: export udp_suspend_receiver() and udp_resume_receiver() QUIC will rely on UDP at the receiver level, and will need these functions to suspend/resume the receivers. In the future, protocol chaining may simplify this.	2020-12-08 18:10:18 +01:00
Willy Tarreau	c14e7ae744	MINOR: connection: use the control layer's init/close In conn_ctrl_init() and conn_ctrl_close() we now use the control layer's functions instead of manipulating the FD directly. This is safe since the control layer is always present when done. Note that now we also adjust the flag before calling the function to make things cleaner in case such a layer would need to call the same functions again for any reason.	2020-12-08 15:53:45 +01:00
Willy Tarreau	de471c4655	MINOR: protocol: add a set of ctrl_init/ctrl_close methods for setup/teardown Currently conn_ctrl_init() does an fd_insert() and conn_ctrl_close() does an fd_delete(). These are the only two short-term obstacles against using a non-fd handle to set up a connection. Let's put these into the protocol layer, along with the other connection-level stuff so that the generic connection code uses them instead. This will allow to define new ones for other protocols (e.g. QUIC). Since we only support regular sockets at the moment, the code was placed into sock.c and shared with proto_tcp, proto_uxst and proto_sockpair.	2020-12-08 15:50:56 +01:00
Willy Tarreau	b366c9a59a	CLEANUP: protocol: group protocol struct members by usage For the sake of an improved readability, let's group the protocol field members according to where they're supposed to be defined: - connection layer (note: for now even UDP needs one) - binding layer - address family - socket layer Nothing else was changed.	2020-12-08 14:58:24 +01:00
Willy Tarreau	b9b2fd7cf4	MINOR: protocol: export protocol definitions The various protocols were made static since there was no point in exporting them in the past. Nowadays with QUIC relying on UDP we'll significantly benefit from UDP being exported and more generally from being able to declare some functions as being the same as other protocols'. In an ideal world it should not be these protocols which should be exported, but the intermediary levels: - socket layer (sock.c only right now), already exported as functions but nothing structured at the moment ; - family layer (sock_inet, sock_unix, sockpair etc): already structured and exported - binding layer (the part that relies on the receiver): currently fused within the protocol - connection layer (the part that manipulates connections): currently fused within the protocol - protocol (connection's control): shouldn't need to be exposed ultimately once the elements above are in an easily sharable way.	2020-12-08 14:54:08 +01:00
Willy Tarreau	f9ad06cb26	MINOR: protocol: remove the redundant ->sock_domain field This field used to be needed before commit `2b5e0d8b6` ("MEDIUM: proto_udp: replace last AF_CUST_UDP* with AF_INET*") as it was used as a protocol entry selector. Since this commit it's always equal to the socket family's value so it's entirely redundant. Let's remove it now to simplify the protocol definition a little bit.	2020-12-08 12:13:54 +01:00
Christopher Faulet	16df178b6e	BUG/MEDIUM: stream: Xfer the input buffer to a fully created stream The input buffer passed as argument to create a new stream must not be transferred when the request channel is initialized because the channel flags are not set at this stage. In addition, the API is a bit confusing regarding the buffer owner when an error occurred. The caller remains the owner, but reading the code it is not obvious. So, first of all, to avoid any ambiguities, comments are added on the calling chain to make it clear. The buffer owner is the caller if any error occurred. And the ownership is transferred to the stream on success. Then, to make things simple, the ownership is transferred at the end of stream_new(), in case of success. And the input buffer is updated to point on BUF_NULL. Thus, in all cases, if the caller tries to release it by calling b_free() on it, it is not a problem. Of course, it remains the caller's responsibility to release it on error. The patch fixes a bug introduced by the commit `26256f86e` ("MINOR: stream: Pass an optional input buffer when a stream is created"). No backport is needed.	2020-12-04 17:15:03 +01:00
Willy Tarreau	d1f250f87b	MINOR: listener: now use a generic add_listener() function With the removal of the family-specific port setting, all protocols had exactly the same implementation of ->add(). A generic one was created with the name "default_add_listener" so that all other ones can now be removed. The API was slightly adjusted so that the protocol and the listener are passed instead of the listener and the port. Note that all protocols continue to provide this ->add() method instead of routinely calling default_add_listener() from create_listeners(). This makes sure that any non-standard protocol will still be able to intercept the listener addition if needed. This could be backported to 2.3 along with the few previous patches on listeners as a pure code cleanup.	2020-12-04 15:08:00 +01:00
Willy Tarreau	73bed9ff13	MINOR: protocol: add a ->set_port() helper to address families At various places we need to set a port on an IPv4 or IPv6 address, and it requires casts that are easy to get wrong. Let's add a new set_port() helper to the address family to assist in this. It will be directly accessible from the protocol and will make the operation seamless. Right now this is only implemented for sock_inet as other families do not need a port.	2020-12-04 15:08:00 +01:00
Christopher Faulet	6ad06066cd	CLEANUP: connection: Remove CS_FL_READ_PARTIAL flag Since the recent refactoring of the H1 multiplexer, this flag is no longer used. Thus it is removed.	2020-12-04 14:41:49 +01:00
Christopher Faulet	da831fa068	CLEANUP: http-ana: Remove TX_WAIT_NEXT_RQ unused flag This flag is now unused. It was used in REQ_WAIT_HTTP analyser, when a stream was waiting for a request, to set the keep-alive timeout or to avoid sending HTTP errors to the client.	2020-12-04 14:41:49 +01:00
Christopher Faulet	2afd874704	CLEANUP: htx: Remove HTX_FL_UPGRADE unused flag Now the H1 to H2 upgrade is handled before the stream creation. HTX_FL_UPGRADE flag is now unused.	2020-12-04 14:41:49 +01:00
Christopher Faulet	4c8ad84232	MINOR: mux: Add a ctl parameter to get the exit status of the multiplexers The ctl param MUX_EXIT_STATUS can be requested to get the exit status of a multiplexer. For instance, it may be an HTTP status code or an H2 error. For now, 0 is always returned. When the mux h1 will be able to return HTTP errors itself, this ctl param will be used to get the HTTP status code from the logs. The mux_exit_status enum has been created to map internal mux exit status to a generic one. Thus there are 5 possible statuses for now: success, invalid error, timeout error, internal error and unknown.	2020-12-04 14:41:49 +01:00
Christopher Faulet	7d0c19e82d	MINOR: session: Add functions to increase http values of tracked counters The cumulative numbers of http requests and http errors of counters tracked at the session level and their rates can now be updated at the session level thanks to two new functions. These functions are not used for now, but they will be called to keep tracked counters up-to-date if an error occurs before the stream creation.	2020-12-04 14:41:49 +01:00
Christopher Faulet	84600631cd	MINOR: stick-tables: Add functions to update some values of a tracked counter The cumulative numbers of http requests, http errors, bytes received and sent and their respective rates for a tracked counter are now updated using specific stream independent functions. These functions are used by the stream but the aim is to allow the session to do so too. For now, there is no reason to perform these updates from the session, except from the mux-h2 maybe. But, the mux-h1, on the frontend side, will be able to return some errors to the client, before the stream creation. In this case, it will be mandatory to update counters tracked at the session level.	2020-12-04 14:41:49 +01:00
Christopher Faulet	26256f86e1	MINOR: stream: Pass an optional input buffer when a stream is created It is now possible to set the buffer used by the channel request buffer when a stream is created. It may be useful if input data are already received, instead of waiting the first call to the mux rcv_buf() callback. This change is mandatory to support H1 connection with no stream attached. For now, the multiplexers don't pass any buffer. BUF_NULL is thus used to call stream_create_from_cs().	2020-12-04 14:41:48 +01:00
Christopher Faulet	afc02a4436	MINOR: muxes: Remove get_cs_info callback function now useless This callback function was only defined by the mux-h1. But it has been removed in the previous commit because it is unused now. So, we can do a step forward removing the callback function from the mux definition and the cs_info structure.	2020-12-04 14:41:48 +01:00
Christopher Faulet	d517396f8e	MINOR: session: Add the idle duration field into the session The idle duration between two streams is added to the session structure. It is not necessarily pertinent on all protocols. In fact, it is only defined for H1 connections. It is the duration between two H1 transactions. But the .get_cs_info() callback function on the multiplexers only exists because this duration is missing at the session level. So it is a simplification opportunity for a really low cost. To reduce the cost, a hole in the session structure is filled by moving .srv_list field at the end of the structure.	2020-12-04 14:41:48 +01:00
Thierry Fournier	c749259dff	MINOR: lua-thread: Store each function reference and init reference in array The goal is to allow execution of one main lua state per thread. The array introduces storage of one reference per thread, because each lua state can have different reference id for a same function. A function returns the preferred state id according to configuration and current thread id.	2020-12-02 21:53:16 +01:00
Thierry Fournier	021d986ecc	MINOR: lua-thread: Replace state_from by state_id The goal is to allow execution of one main lua state per thread. "state_from" is a pointer to the parent lua state. "state_id" is the index of the parent state id in the reference lua states array. "state_id" is better because the lock is a "== 0" test which is quicker than a pointer comparison. In addition, the state_id index could index things other than the Lua state concerned, such as the function references.	2020-12-02 21:53:16 +01:00
Thierry Fournier	62a22aa23f	MINOR: lua-thread: Replace "struct hlua_function" allocation by dedicated function The goal is to allow execution of one main lua state per thread. This function will initialize the struct with other things than 0. With this function helper, the initialization is centralized and it prevents mistakes. This patch also keeps a reference to each declared function in a list. It will be useful in next patches to control consistency of declared references.	2020-12-02 21:53:16 +01:00
Thierry Fournier	75fc02956b	MINOR: lua-thread: make hlua_ctx_init() get L from its caller The goal is to allow execution of one main lua state per thread. The function hlua_ctx_init() now gets the original lua state from its caller. This allows the initialisation of lua_thread (coroutines) from any master lua state. The parent lua state is stored in the hlua struct. This patch is a temporary transition, it will be modified later.	2020-12-02 21:53:16 +01:00
Thierry Fournier	ad5345fed7	MINOR: lua-thread: Replace embedded struct hlua_function by a pointer The goal is to allow execution of one main lua state per thread. Because this struct will be filled after the configuration parser, we cannot copy the content. The actual state of the Haproxy code doesn't justify this change, it is an update preparing next steps.	2020-12-02 21:53:16 +01:00
Thierry Fournier	a51a1fd174	MINOR: cli: add a function to look up a CLI service description This function will be useful to check if the keyword is already registered. Also add a define for the max number of args. This will be needed by a next patch to fix a bug and will have to be backported.	2020-12-02 09:45:18 +01:00
Thierry Fournier	87e539906b	MINOR: actions: add a function returning a service pointer from its name This function simply calls action_lookup() on the private service_keywords, to look up a service name. This will be used to detect double registration of a same service from Lua. This will be needed by a next patch to fix a bug and will have to be backported.	2020-12-02 09:45:18 +01:00
Thierry Fournier	7a71a6d9d2	MINOR: actions: Export actions lookup functions These functions will be useful to check if a keyword is already registered. This will be needed by a next patch to fix a bug, and will need to be backported.	2020-12-02 09:45:18 +01:00
Willy Tarreau	a1f12746b1	MINOR: traces: add a new level "error" below the "user" level Sometimes it would be nice to be able to only trace abnormal events such as protocol errors. Let's add a new "error" level below the "user" level for this. This will allow to add TRACE_ERROR() at various error points and only see them.	2020-12-01 10:25:20 +01:00
Maciej Zdeb	fcdfd857b3	MINOR: log: Logging HTTP path only with %HPO This patch adds a new logging variable '%HPO' for logging HTTP path only (without query string) from relative or absolute URI. For example: log-format "hpo=%HPO hp=%HP hu=%HU hq=%HQ" GET /r/1 HTTP/1.1 => hpo=/r/1 hp=/r/1 hu=/r/1 hq= GET /r/2?q=2 HTTP/1.1 => hpo=/r/2 hp=/r/2 hu=/r/2?q=2 hq=?q=2 GET http://host/r/3 HTTP/1.1 => hpo=/r/3 hp=http://host/r/3 hu=http://host/r/3 hq= GET http://host/r/4?q=4 HTTP/1.1 => hpo=/r/4 hp=http://host/r/4 hu=http://host/r/4?q=4 hq=?q=4	2020-12-01 09:32:44 +01:00
Emeric Brun	0237c4e3f5	BUG/MEDIUM: local log format regression. Since 2.3 default local log format always adds hostname field. This behavior change was due to log/sink re-work, because according to rfc3164 the hostname field is mandatory. This patch re-introduces a legacy "local" format which is analogous to rfc3164 but with hostname stripped. This is the new default if logs are generated by haproxy. To stay compliant with previous configurations, the option "log-send-hostname" acts as if the default format is switched to rfc3164. This patch addresses the github issue #963 This patch should be backported in branches >= 2.3.	2020-12-01 06:58:42 +01:00
Willy Tarreau	4d6c594998	BUG/MEDIUM: task: close a possible data race condition on a tasklet's list link In issue #958 Ashley Penney reported intermittent crashes on AWS's ARM nodes which would not happen on x86 nodes. After investigation it turned out that the Neoverse N1 CPU cores used in the Graviton2 CPU are much more aggressive than the usual Cortex A53/A72/A55 or any x86 regarding memory ordering. The issue that was triggered there is that if a tasklet_wakeup() call is made on a tasklet scheduled to run on a foreign thread and that tasklet is just being dequeued to be processed, there can be a race at two places: - if MT_LIST_TRY_ADDQ() happens between MT_LIST_BEHEAD() and LIST_SPLICE_END_DETACHED() if the tasklet is alone in the list, because the emptiness tests matches ; - if MT_LIST_TRY_ADDQ() happens during LIST_DEL_INIT() in run_tasks_from_lists(), then depending on how LIST_DEL_INIT() ends up being implemented, it may even corrupt the adjacent nodes while they're being reused for the in-tree storage. This issue was introduced in 2.2 when support for waking up remote tasklets was added. Initially the attachment of a tasklet to a list was enough to know its status and this used to be stable information. Now it's not sufficient to rely on this anymore, thus we need to use a different information. This patch solves this by adding a new task flag, TASK_IN_LIST, which is atomically set before attaching a tasklet to a list, and is only removed after the tasklet is detached from a list. It is checked by tasklet_wakeup_on() so that it may only be done while the tasklet is out of any list, and is cleared during the state switch when calling the tasklet. Note that the flag is not set for pure tasks as it's not needed. However this introduces a new special case: the function tasklet_remove_from_tasklet_list() needs to keep both states in sync and cannot check both the state and the attachment to a list at the same time. 
This function is already limited to being used by the thread owning the tasklet, so in this case the test remains reliable. However, just like its predecessors, this function is wrong by design and it should probably be replaced with a stricter one, a lazy one, or be totally removed (it's only used in checks to avoid calling a possibly scheduled event, and when freeing a tasklet). Regardless, for now the function exists so the flag is removed only if the deletion could be done, which covers all cases we're interested in regarding the insertion. This removal is safe against a concurrent tasklet_wakeup_on() since MT_LIST_DEL() guarantees the atomic test, and will ultimately clear the flag only if the task could be deleted, so the flag will always reflect the last state. This should be carefully backported as far as 2.2 after some observation period. This patch depends on previous patch "MINOR: task: remove __tasklet_remove_from_tasklet_list()".	2020-11-30 18:17:59 +01:00
Willy Tarreau	2da4c316c2	MINOR: task: remove __tasklet_remove_from_tasklet_list() This function is only used at a single place directly within the scheduler in run_tasks_from_lists() and it really ought not be called by anything else, regardless of what its comment says. Let's delete it, move the two lines directly into the call place, and take this opportunity to factor the atomic decrement on tasks_run_queue. A comment was added on the remaining one tasklet_remove_from_tasklet_list() to mention the risks in using it.	2020-11-30 18:17:44 +01:00
Willy Tarreau	a868c2920b	MINOR: task: remove tasklet_insert_into_tasklet_list() This function is only called at a single place and adds more confusion than it removes. It also makes one think it could be used outside of the scheduler while it must absolutely not. Let's just move its two lines to the call place, making the code more readable there. In addition this clearly shows that the preliminary LIST_INIT() is useless since the entry is immediately overwritten.	2020-11-30 18:17:44 +01:00
Olivier Houchard	1f05324cbe	BUG/MEDIUM: lists: Lock the element while we check if it is in a list. In MT_LIST_TRY_ADDQ() and MT_LIST_TRY_ADD() we can't just check if the element is already in a list, because there's a small race condition, it could be added between the time we checked, and the time we actually set its next and prev, so we have to lock it first. This is required to address issue #958. This should be backported to 2.3, 2.2 and 2.1.	2020-11-30 18:17:29 +01:00
Your Name	1e237d037b	MINOR: plock: use an ARMv8 instruction barrier for the pause instruction As suggested by @AGSaidi in issue #958, on ARMv8 it's convenient to use an "isb" instruction in pl_cpu_relax() to improve fairness. Without it I've met a few watchdog conditions on valid locks with 16 threads, indicating that some threads couldn't manage to get it in 2 seconds. It never happened again with it. In addition, the performance increased by slightly more than 5% thanks to the reduced contention. This should be backported as far as 2.2, possibly even 2.0.	2020-11-29 14:53:33 +01:00
Christopher Faulet	97b7bdfcf7	REORG: tcpcheck: Move check option parsing functions based on tcp-check The parsing of the check options based on tcp-check rules (redis, spop, smtp, http...) is moved away from check.c. Now, these functions are placed in tcpcheck.c. These functions are only related to the tcpcheck ruleset configured on a proxy and not to the health-check attached to a server.	2020-11-27 10:30:23 +01:00
Christopher Faulet	bb9fb8b7f8	MINOR: config: Deprecate and ignore tune.chksize global option This option is now ignored because I/O check buffers are now allocated using the buffer pool. Thus, it is marked as deprecated in the documentation and ignored during the configuration parsing. The field is also removed from the global structure. Because this option is ignored since a recent fix, backported as far as 2.2, this patch should be backported too. Especially because it updates the documentation.	2020-11-27 10:30:23 +01:00
Christopher Faulet	b381a505c1	BUG/MAJOR: tcpcheck: Allocate input and output buffers from the buffer pool Historically, the input and output buffers of a check are allocated by hand during the startup, with a specific size (not necessarily the same as other buffers). But since the recent refactoring of the checks to rely exclusively on the tcp-checks and to use the underlying mux layer, this part is totally buggy. Indeed, because these buffers are now passed to a mux, they may be swapped if a zero-copy is possible. In fact, for now it is only possible in h2_rcv_buf(). Thus the bug concretely only exists if a h2 health-check is performed. But, it is a latent bug for other muxes. Another problem is the size of these buffers. Because it may differ from the other buffer size, it might be a source of bugs. Finally, for configurations with hundreds of thousands of servers, having 2 buffers per check always allocated may be an issue. To fix the bug, we now allocate these buffers when required using the buffer pool. Thus not-running checks don't waste memory and muxes may swap them if possible. The only drawback is the check buffers have now always the same size as the buffers used by the streams. This deprecates indirectly the "tune.chksize" global option. In addition, the http-check regtest has been updated to perform some h2 health-checks. Many thanks to @VigneshSP94 for its help on this bug. This patch should solve the issue #936. It relies on the commit "MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main". Both must be backported as far as 2.2.	2020-11-27 10:29:41 +01:00
Remi Tricot-Le Breton	3d08236cb3	MINOR: cache: Prepare helper functions for Vary support The Vary functionality is based on a secondary key that needs to be calculated for every request to which a server answers with a Vary header. The Vary header, which can only be found in server responses, determines which headers of the request need to be taken into account in the secondary key. Since we do not want to have to store all the headers of the request until we have the response, we will pre-calculate as many sub-hashes as there are headers that we want to manage in a Vary context. We will only focus on a subset of headers which are likely to be mentioned in a Vary response (accept-encoding and referer for now). Every managed header will have its own normalization function which is in charge of transforming the header value into a core representation, more robust to insignificant changes that could exist between multiple clients. For instance, two accept-encoding values mentioning the same encodings but in different orders should give the same hash. This patch adds a function that parses a Vary header value and checks if all the values belong to our supported subset. It also adds the normalization functions for our two headers, as well as utility functions that can prebuild a secondary key for a given request and transform it into an actual secondary key after the vary signature is determined from the response.	2020-11-24 16:52:57 +01:00
Christopher Faulet	401e6dbff3	BUG/MAJOR: filters: Always keep all offsets up to date during data filtering When at least one data filter is registered on a channel, the offsets of all filters must be kept up to date. For data filters but also for others. It is safer to do it in that way. Indirectly, this patch fixes 2 hidden bugs revealed by the commit `22fca1f2c` ("BUG/MEDIUM: filters: Forward all filtered data at the end of http filtering"). The first one, the worst of both, happens at the end of http filtering when at least one data filter is registered on the channel. We call the http_end() callback function on the filters, when defined, to finish the http filtering. But it is performed for all filters. Before the commit `22fca1f2c`, the only risk was to call the http_end() callback function unexpectedly on a filter. Now, we may have an overflow on the offset variable, used at the end to forward all filtered data. Of course, from the moment we forward an arbitrarily huge amount of data, all kinds of bad things may happen. So offset computation is performed for all filters and http_end() callback function is called only for data filters. The other one happens when a data filter alters the data of a channel, it must update the offsets of all previous filters. But the offset of non-data filters must be up to date, otherwise, here too we may have an integer overflow. Another way to fix these bugs is to always ignore non-data filters from the offsets computation. But this patch is safer and probably easier to maintain. This patch must be backported in all versions where the above commit is. So as far as 2.0.	2020-11-24 14:17:32 +01:00
Ilya Shipitsin	5bfe66366c	BUILD: SSL: do not "update" BoringSSL version equivalent anymore we have added all required fine guarding, no need to reduce BoringSSL version back to 1.1.0 anymore, we do not depend on it	2020-11-24 09:54:44 +01:00
Ilya Shipitsin	f04a89c549	CLEANUP: remove unused function "ssl_sock_is_ckch_valid" "ssl_sock_is_ckch_valid" is not used anymore, let us remove it	2020-11-24 09:54:44 +01:00
Julien Pivotto	2de240a676	MINOR: stream: Add level 7 retries on http error 401, 403 Level-7 retries are only possible with a restricted number of HTTP return codes. While it is usually not safe to retry on 401 and 403, I came up with an authentication backend which was not synchronizing authentication of users. While not perfect, being allowed to also retry on those return codes is really helpful and acts as a hotfix until we can fix the backend. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-11-23 09:33:14 +01:00
Maciej Zdeb	ebdd4c55da	MINOR: http_act: Add -m flag for del-header name matching method This patch adds -m flag which allows to specify header name matching method when deleting headers from http request/response. Currently beg, end, sub, str and reg are supported. This is related to GitHub issue #909	2020-11-21 15:54:30 +01:00
Willy Tarreau	3aab17bd56	BUG/MAJOR: connection: reset conn->owner when detaching from session list Baptiste reported a new crash affecting 2.3 which can be triggered when using H2 on the backend, with http-reuse always and with tens of clients doing close only. There are a few combined cases which cause this to happen, but each time the issue is the same, an already freed session is dereferenced in session_unown_conn(). Two cases were identified to cause this: - a connection referencing a session as its owner, which is detached from the session's list and is destroyed after this session ends. The test on conn->owner before calling session_unown_conn() is not sufficient as the pointer is not null but is not valid anymore. - a connection that never goes idle and that gets killed from the mux, where session_free() is called first, then conn_free() calls session_unown_conn() which scans the just freed session for older connections. This one is only triggered with DEBUG_UAF The reason for this session to be present here is that it's needed during the connection setup, to be passed to conn_install_mux_be() to mux->init() as the owning session, but it's never deleted afterwards. Furthermore, even conn_session_free() doesn't delete this pointer after freeing the session that lies there. Both do definitely result in a use-after-free that's more easily triggered under DEBUG_UAF. This patch makes sure that the owner is always deleted after detaching or killing the session. However it is currently not possible to clear the owner right after a synchronous init because the proxy protocol apparently needs it (a reg test checks this), and if we leave it past the connection setup with the session not attached anywhere, it's hard to catch the right moment to detach it. This means that the session may remain in conn->owner as long as the connection has never been added to nor removed from the session's idle list. 
Given that this patch needs to remain simple enough to be backported, instead it adds a workaround in session_unown_conn() to detect that the element is already not attached anywhere. This fix absolutely requires previous patch "CLEANUP: connection: do not use conn->owner when the session is known" otherwise the situation will be even worse, as some places used to rely on conn->owner instead of the session. The fix could theoretically be backported as far as 1.8. However, the code in this area has significantly changed along versions and there are more risks of breaking working stuff than fixing real issues there. The issue was really woken up in two steps during 2.3-dev when slightly reworking the idle conns with commit `08016ab82` ("MEDIUM: connection: Add private connections synchronously in session server list") and when adding support for storing used H2 connections in the session and adding the necessary call to session_unown_conn() in the muxes. But the same test managed to crash 2.2 when built in DEBUG_UAF and patched like this, proving that we used to already leave dangling pointers behind us: | diff --git a/include/haproxy/connection.h b/include/haproxy/connection.h | index f8f235c1a..dd30b5f80 100644 | --- a/include/haproxy/connection.h | +++ b/include/haproxy/connection.h | @@ -458,6 +458,10 @@ static inline void conn_free(struct connection *conn) | sess->idle_conns--; | session_unown_conn(sess, conn); | } | + else { | + struct session *sess = conn->owner; | + BUG_ON(sess && sess->origin != &conn->obj_type); | + } | | sockaddr_free(&conn->src); | sockaddr_free(&conn->dst); It's uncertain whether an existing code path there can lead to dereferencing conn->owner when it's bad, though certain suspicious memory corruption bugs make one think it's a likely candidate. The patch should not be hard to adapt there. Backports to 2.1 and older are left to the appreciation of the person doing the backport. 
A reproducer consists in this: global nbthread 1 listen l bind :9000 mode http http-reuse always server s 127.0.0.1:8999 proto h2 frontend f bind :8999 proto h2 mode http http-request return status 200 Then this will make it crash within 2-3 seconds: $ h1load -e -r 1 -c 10 http://0:9000/ If it does not, it might be that DEBUG_UAF was not used (it's harder then) and it might be useful to restart.	2020-11-21 15:29:22 +01:00
Willy Tarreau	38b4d2eb22	CLEANUP: connection: do not use conn->owner when the session is known At a few places we used to rely on conn->owner to retrieve the session while the session is already known. This is not correct because at some of these points the reason the connection's owner was still the session (instead of NULL) is a mistake. At one place a comparison is even made between the session and conn->owner assuming it's valid without checking if it's NULL. Let's clean this up to use the session all the time. Note that this will be needed for a forthcoming fix and will have to be backported.	2020-11-21 15:29:22 +01:00
Ilya Shipitsin	f34ed0b74c	BUILD: SSL: guard TLS13 ciphersuites with HAVE_SSL_CTX_SET_CIPHERSUITES HAVE_SSL_CTX_SET_CIPHERSUITES is a newly defined macro set in openssl-compat.h, which helps to identify ssl libs (currently OpenSSL-1.1.1 only) that support TLS13 ciphersuites manipulation on TLS13 context	2020-11-21 11:04:36 +01:00
Ilya Shipitsin	bdec3ba796	BUILD: ssl: use SSL_MODE_ASYNC macro instead of OPENSSL_VERSION	2020-11-19 19:59:32 +01:00
William Dauchy	f63704488e	MEDIUM: cli/ssl: configure ssl on server at runtime in the context of a progressive backend migration, we want to be able to activate SSL on outgoing connections to the server at runtime without reloading. This patch adds a `set server ssl` command; in order to allow that: - add `srv_use_ssl` to `show servers state` command for compatibility, also update associated parsing - when using default-server ssl setting, and `no-ssl` on server line, init SSL ctx without activating it - when triggering ssl API, de/activate SSL connections as requested - clean ongoing connections as it is done for addr/port changes, without checking prior server state example config: backend be_foo default-server ssl server srv0 127.0.0.1:6011 weight 1 no-ssl show servers state: 5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - -1 where srv0 can switch to ssl later during the runtime: set server be_foo/srv0 ssl on 5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - 1 Also update existing tests and create a new one. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2020-11-18 17:22:28 +01:00
Christopher Faulet	83fefbcdff	MINOR: init: Fix the prototype for per-thread free callbacks Functions registered to release memory per-thread have no return value. But the registering function and the function pointer in per_thread_free_fct structure specify it should return an integer. This patch fixes it. This patch may be backported as far as 2.0.	2020-11-13 16:26:10 +01:00
Amaury Denoyelle	7f8f6cb926	BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one Define per-thread counters allocated with the greatest size of any stat module's counters. This variable is named trash_counters. When using a proxy without allocated counters, return the trash counters from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent segfault. This is useful for all the proxies used internally and not belonging to the global proxy list. As these objects do not appear on the stat report, it does not matter if the dummy counters are used. For this fix to be functional, the extra counters are explicitly initialized to NULL on proxy/server/listener init functions. Most notably, the crash has already been detected with the following vtc: - reg-tests/lua/txn_get_priv.vtc - reg-tests/peers/tls_basic_sync.vtc - reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc There are probably other parts that may be impacted (SPOE for example). This bug was introduced in the current release and does not need to be backported. The faulty commits are "MINOR: ssl: count client hello for stats" and "MINOR: ssl: add counters for ssl sessions".	2020-11-12 15:16:05 +01:00
Remi Tricot-Le Breton	cc9bf2e5fe	MEDIUM: cache: Change caching conditions Do not cache responses that do not have an explicit expiration time (s-maxage or max-age Cache-Control directives or Expires header) or a validator (ETag or Last-Modified headers) anymore, as suggested in RFC 7234#3. The TX_FLAG_IGNORE flag is used instead of the TX_FLAG_CACHEABLE so as not to change the behavior of the checkcache option.	2020-11-12 11:22:05 +01:00
Christopher Faulet	a66adf41ea	MINOR: http-htx: Add understandable errors for the errorfiles parsing No details are provided when an error occurs during the parsing of an errorfile. Thus it is a bit hard to diagnose where the problem is. Now, when it happens, an understandable error message is reported. This patch is not a bug fix in itself. But it will be required to change a fatal error into a warning in last stable releases. Thus it must be backported as far as 2.0.	2020-11-06 09:13:58 +01:00
Willy Tarreau	38d41996c1	MEDIUM: pattern: turn the pattern chaining to single-linked list It does not require heavy deletion from the expr anymore, so we can now turn this to a single-linked list since most of the time we want to delete all instances of a given pattern from the head. By doing so we save 32 bytes of memory per pattern. The pat_unlink_from_head() function was adjusted accordingly.	2020-11-05 19:27:09 +01:00
Willy Tarreau	94b9abe200	MINOR: pattern: add pat_ref_purge_older() to purge old entries This function will be usable to purge at most a specified number of old entries from a reference. Entries are declared old if their generation number is in the past compared to the one passed in argument. This will ease removal of early entries when new ones have been appended. We also call malloc_trim() when available, at the end of the series, because this is one place where there is a lot of memory to save. Reloads of 1M IP addresses used in an ACL made the process grow up to 1.7 GB RSS after 10 reloads and roughly stabilize there without this call, versus only 260 MB when the call is present. Sadly there is no direct equivalent for jemalloc, which stabilizes around 800MB-1GB.	2020-11-05 19:27:09 +01:00
Willy Tarreau	1a6857b9c1	MINOR: pattern: implement pat_ref_load() to load a pattern at a given generation pat_ref_load() basically combines pat_ref_append() and pat_ref_commit(). It's very similar to pat_ref_add() except that it also allows to set the generation ID and the line number. pat_ref_add() was modified to directly rely on it to avoid code duplication. Note that a previous declaration of pat_ref_load() was removed as it was just a leftover of an earlier incarnation of something possibly similar, so no existing functionality was changed here.	2020-11-05 19:27:09 +01:00
Willy Tarreau	0439e5eeb4	MINOR: pattern: add pat_ref_commit() to commit a previously inserted element This function will be used after a successful pat_ref_append() to propagate the pattern to all use places (including parsing and indexing). On failure, it will entirely roll back all insertions and free the pattern itself. It also preserves the generation number so that it is convenient for use in association with pat_ref_append(). pat_ref_add() was modified to rely on it instead of open-coding the insertion and roll-back.	2020-11-05 19:27:09 +01:00
Willy Tarreau	29947745b5	MINOR: pattern: store a generation number in the reference patterns Right now it's not possible to perform a safe reload because we don't know what patterns were recently added or were already present. This patch adds a generation counter to the reference patterns so that it is possible to know what generation of the reference they were loaded with. A reference now has two generations, the current one, used for all additions, and the next one, allocated to those wishing to update the contents. The generation wraps at 2^32 so comparisons must be made relative to the current position. The idea will be that upon full reload, the caller will first get a new generation ID, will insert all new patterns using it, will then switch the current ID to the new one, and will delete all entries older than the current ID. This has the benefit of supporting chunked updates that remain consistent and that won't block the whole process for ages like pat_ref_reload() currently does.	2020-11-05 19:27:09 +01:00
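The note above that the generation "wraps at 2^32 so comparisons must be made relative to the current position" can be illustrated with a minimal sketch. The helper name `gen_is_older()` is hypothetical and is not taken from the HAProxy sources; it only shows one common way to compare circular 32-bit counters.

```c
#include <stdint.h>

/* Hypothetical helper (not HAProxy's code): returns non-zero if generation
 * <gen> is older than <cur> on a 32-bit circular counter. The unsigned
 * subtraction wraps modulo 2^32, so a result in the upper half of the range
 * means <gen> sits "behind" <cur>, even when the counter has wrapped.
 */
static inline int gen_is_older(uint32_t gen, uint32_t cur)
{
	return (uint32_t)(gen - cur) >= 0x80000000U;
}
```

With this convention, a generation of 0xFFFFFFFF is considered older than a current generation of 0, which a plain `gen < cur` test would get wrong.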
Willy Tarreau	1fd52f70e5	MINOR: pattern: introduce pat_ref_delete_by_ptr() to delete a valid reference Till now the only way to remove a known reference was via pat_ref_delete_by_id() which scans the whole list to find a matching pointer. Let's add pat_ref_delete_by_ptr() which takes a valid pointer. It can be called by the function above after the pointer is found, and can also be used to roll back a failed insertion much more efficiently.	2020-11-05 19:27:09 +01:00
Willy Tarreau	a98b2882ac	CLEANUP: pattern: remove pat_delete_fcts[] and pattern_head->delete() These ones are not used anymore, so let's remove them to remove a bit of the complexity. The ACL keyword's delete() function could be removed as well, though most keyword declarations are positional and we have a high risk of introducing a mistake here, so let's not touch the ACL part.	2020-11-05 19:27:09 +01:00
Willy Tarreau	f1c0892aa6	MINOR: pattern: remerge the list and tree deletion functions pat_del_tree_gen() was already chained onto pat_del_list_gen() to deal with remaining cases, so let's complete the merge and have a generic pattern deletion function acting on the reference and taking care of reliably removing all elements.	2020-11-05 19:27:09 +01:00
Willy Tarreau	78777ead32	MEDIUM: pattern: change the pat_del_* functions to delete from the references This is the next step in speeding up entry removal. Now we don't scan the whole lists or trees for elements pointing to the target reference, instead we start from the reference and delete all linked patterns. This simplifies some delete functions since we don't need anymore to delete multiple times from an expression since all nodes appear after the reference element. We can now have one generic list and one generic tree deletion function. This required the replacement of pattern_delete() with an open-coded version since we now need to lock all expressions first before proceeding. This means there is a high risk of lock inversion here but given that the expressions are always scanned in the same order from the same head, this must not happen. Now deleting first entries is instantaneous, and it's still slow to delete the last ones when looking up their ID since it still requires to look them up by a full scan, but it's already way faster than previously. Typically removing the last 10 IP from a 20M entries ACL with a full-scan each took less than 2 seconds. It would be technically possible to make use of indexed entries to speed up most lookups for removal by value (e.g. IP addresses) but that's for later.	2020-11-05 19:27:09 +01:00
Willy Tarreau	4bdd0a13d6	MEDIUM: pattern: link all final elements from the reference There is a data model issue in the current pattern design that makes pattern deletion extremely expensive: there's no direct way from a reference to access all indexed occurrences. As such, the only way to remove all indexed entries corresponding to a reference update is to scan all expressions' lists and trees to find a link to the reference. While this was possibly OK when map removal was not common and most maps were small, this is not conceivable anymore with GeoIP maps containing 10M+ entries and del-map operations that are triggered from http-request rulesets. This patch introduces two list heads from the pattern reference, one for the objects linked by lists and one for those linked by tree node. Ideally a single list would be enough but the linked elements are too unrelated to be distinguished at the moment, so we'll need two lists. However for the long term a single-linked list will suffice but for now it's not possible due to the way elements are removed from expressions. As such this patch adds 32 bytes of memory usage per reference plus 16 per indexed entry, but both will be cut in half later. The links are not yet used for deletion, this patch only ensures the list is always consistent.	2020-11-05 19:27:09 +01:00
Willy Tarreau	6d8a68914e	MINOR: pattern: make the delete and prune functions more generic Now we have a single prune() function to act on an expression, and one delete function for the lists and one for the trees. The presence of a pointer in the lists is enough to warrant a free, and we rely on the PAT_SF_REGFREE flag to decide whether to free using free() or regfree().	2020-11-05 19:27:09 +01:00
Willy Tarreau	9b5c8bbc89	MINOR: pattern: new sflag PAT_SF_REGFREE indicates regex_free() is needed Currently we have no way to know how to delete/prune a pattern in a generic way. A pattern doesn't contain its own type so we don't know what function to call. Tree nodes are roughly OK but not lists where regex are possible. Let's add one new bit for sflags at index time to indicate that regex_free() will be needed upon deletion. It's not used for now.	2020-11-05 19:27:08 +01:00
Willy Tarreau	3ee0de1b41	MINOR: pattern: move the update revision to the pat_ref, not the expression It's not possible to uniquely update a single expression without updating the pattern reference, I don't know why we've put the revision in the expression back then, given that it in fact provides an update for a full pattern. Let's move the revision into the reference's head instead.	2020-11-05 19:27:08 +01:00
Willy Tarreau	1d3c7003d9	MINOR: compat: automatically include malloc.h on glibc This is in order to access malloc_trim() which is convenient after clearing huge maps to reclaim memory. When this is detected, we also define HA_HAVE_MALLOC_TRIM.	2020-11-05 19:27:08 +01:00
Baptiste Assmann	e279ca6bbe	MINOR: sample: Add converters to parse MQTT messages This patch implements a couple of converters to validate and extract data from a MQTT (Message Queuing Telemetry Transport) message. The validation consists of a few checks as well as "packet size" validation. The extraction can get any field from the variable header and the payload. This is limited to CONNECT and CONNACK packet types only. All other messages are considered as invalid. It is not a problem for now because only the first packet on each side can be parsed (CONNECT for the client and CONNACK for the server). MQTT 3.1.1 and 5.0 are supported. Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>	2020-11-05 19:27:03 +01:00
Baptiste Assmann	e138dda1e0	MINOR: sample: Add converters to parse FIX messages This patch implements a couple of converters to validate and extract tag value from a FIX (Financial Information eXchange) message. The validation consists in a few checks such as mandatory fields and checksum computation. The extraction can get any tag value based on a tag string or tag id. This patch requires the istend() function. Thus it depends on "MINOR: ist: Add istend() function to return a pointer to the end of the string". Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>	2020-11-05 19:26:30 +01:00
Christopher Faulet	cf26623780	MINOR: ist: Add istend() function to return a pointer to the end of the string istend() is a shortcut to istptr() + istlen().	2020-11-05 19:25:12 +01:00
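As a rough illustration of "a shortcut to istptr() + istlen()", here is a minimal sketch on a simplified stand-in type. The names `struct ist_sketch` and `sketch_istend()` are hypothetical and only mirror the idea; the real definitions live in the HAProxy tree and may differ.

```c
#include <stddef.h>

/* Simplified stand-in for an indirect string: a pointer plus a length.
 * This is an assumption made for illustration, not the real HAProxy type.
 */
struct ist_sketch {
	char *ptr;
	size_t len;
};

/* Returns a pointer to the first byte past the end of the string. */
static inline char *sketch_istend(struct ist_sketch s)
{
	return s.ptr + s.len;
}
```

Such a helper is convenient as a loop bound when scanning a string that is not necessarily NUL-terminated.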
Willy Tarreau	1db5579bf8	[RELEASE] Released version 2.4-dev0 Released version 2.4-dev0 with the following main changes : - MINOR: version: it's development again. - DOC: mention in INSTALL that it's development again	2020-11-05 17:20:35 +01:00
Willy Tarreau	b9b2ac20f8	MINOR: version: it's development again. This reverts commit `0badabc381`.	2020-11-05 17:18:49 +01:00
Willy Tarreau	0badabc381	MINOR: version: mention that it's stable now This version will be maintained up to around Q1 2022.	2020-11-05 17:00:50 +01:00
Ilya Shipitsin	0aa8c29460	BUILD: ssl: use feature macros for detecting ec curves manipulation support Let us use SSL_CTX_set1_curves_list, defined by OpenSSL, as well as in openssl-compat when SSL_CTRL_SET_CURVES_LIST is present (BoringSSL), for feature detection instead of versions.	2020-11-05 15:08:41 +01:00
Willy Tarreau	5b8af1e30c	MINOR: ssl: define SSL_CTX_set1_curves_list to itself on BoringSSL OpenSSL 1.0.2 and onwards define SSL_CTX_set1_curves_list which is both a function and a macro. OpenSSL 1.0.2 to 1.1.0 define SSL_CTRL_SET_CURVES_LIST as a macro, which disappeared from 1.1.1. BoringSSL only has that one and not the former macro but it does have the function. Let's keep the test on the macro matching the function name by defining the macro to itself when needed.	2020-11-05 15:05:09 +01:00
Willy Tarreau	7e98e28eb0	MINOR: fd: add fd_want_recv_safe() This does the same as fd_want_recv() except that it does check for fd_updt[] to be allocated, as this may be called during early listener initialization. Previously we used to check fd_updt[] before calling fd_want_recv() but this is not correct since it does not update the FD flags. This method will be safer.	2020-11-04 14:22:42 +01:00
Willy Tarreau	9dd7f4fb4b	MINOR: debug: don't count free(NULL) in memstats The mem stats are pretty convenient to spot leaks, except that they count free(NULL) as 1, and the code does actually have quite a number of free(foo) guards where foo is NULL if the object was already freed. Let's just not count these ones so that the stats remain consistent. Now it's possible to compare the strdup()/malloc() and free() and verify they are consistent.	2020-11-03 16:46:48 +01:00
Ilya Shipitsin	04a5a440b8	BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions let us use HAVE_OPENSSL_KEYLOG for feature detection instead of versions	2020-11-03 14:54:15 +01:00
Willy Tarreau	b706a3b4e1	CLEANUP: pattern: remove unused entry "tree" in pattern.val This one might have disappeared since patterns were reworked, but the entry was not removed from the structure, let's do it now.	2020-11-02 11:32:05 +01:00
Willy Tarreau	6bedf151e1	MINOR: pattern: export pat_ref_push() Strangely this one was marked static inline within the file itself. Let's export it.	2020-10-31 13:13:48 +01:00
Willy Tarreau	f4edb72e0a	MINOR: pattern: make pat_ref_append() return the newly added element It's more convenient to return the element than to return just 0 or 1, as the next thing we'll want to do is to act on this element! In addition it was using variable arguments instead of consts, causing some reuse constraints which were also addressed. This doesn't change its use as a boolean, hence why call places were not modified.	2020-10-31 13:13:48 +01:00
Remi Tricot-Le Breton	bb4582cf71	MINOR: ist: Add a case insensitive istmatch function Add a helper function that checks if a string starts with another string while ignoring case.	2020-10-30 13:20:21 +01:00
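A case-insensitive "starts with" check like the one described above could look like the following sketch. The function name and its pointer/length signature are assumptions made for illustration; they are not the signature of the helper added by this commit.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch: returns 1 if the <slen>-byte string <s> starts with
 * the <plen>-byte prefix <pfx>, ignoring ASCII case, otherwise 0.
 */
static int prefix_match_nocase(const char *s, size_t slen,
                               const char *pfx, size_t plen)
{
	size_t i;

	if (plen > slen)
		return 0;

	for (i = 0; i < plen; i++) {
		/* cast to unsigned char: tolower() is undefined for negative values */
		if (tolower((unsigned char)s[i]) != tolower((unsigned char)pfx[i]))
			return 0;
	}
	return 1;
}
```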
Willy Tarreau	bd71510024	MINOR: stats: report server's user-configured weight next to effective weight The "weight" column on the stats page is somewhat confusing when using slowstart because it reports the effective weight, without being really explicit about it. In some situations the user-configured weight is more relevant (especially with long slowstarts where it's important to know if the configured weight is correct). This adds a new uweight stat which reports a server's user-configured weight, and in a backend it receives the sum of all servers' uweights. In addition it adds the mention of "effective" in a few descriptions for the "weight" column (help and doc). As a result, the list of servers in a backend is now always scanned when dumping the stats. But this is not a problem given that these servers are already scanned anyway and for way heavier processing.	2020-10-23 22:47:30 +02:00
Willy Tarreau	3e32036701	MINOR: stats: also support a "no-maint" show stat modifier "no-maint" is a bit similar to "up" except that it will only hide servers that are in maintenance (or disabled in the configuration), and not those that are enabled but failed a check. One benefit here is to significantly reduce the output of the "show stat" command when using large server-templates containing entries that are not yet provisioned. Note that the prometheus exporter also has such an option which does the exact same.	2020-10-23 18:11:24 +02:00
Willy Tarreau	670119955b	Revert "OPTIM: queue: don't call pendconn_unlink() when the pendconn is not queued" This reverts commit `b7ba1d9011`. Actually this test had already been removed in the past by commit `fac0f645d` ("BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe"), but the condition to reproduce the bug mentioned there was not clear. Now after analysis and a certain dose of code cleanup, things start to appear more obvious. What happens is that if we check the presence of the node in the tree without taking the lock, we can see the NULL at the instant the node is being unlinked by another thread in pendconn_process_next_strm() as part of __pendconn_unlink_prx() or __pendconn_unlink_srv(). Till now there is no issue except that the pendconn is not removed from the queue during this operation and that the task is scheduled to be woken up by pendconn_process_next_strm() with the stream being added to the list of the server's active connections by __stream_add_srv_conn(). The first thread finishes faster and gets back to stream_free() faster than the second one sets the srv_conn on the stream, so stream_free() skips the s->srv_conn test and doesn't try to dequeue the freshly queued entry. At the very least a barrier would be needed there but we can't afford to free the stream while it's being queued. So there's no other solution than making sure that either __pendconn_unlink_prx() or pendconn_cond_unlink() get the entry but never both, which is why the lock is required around the test. A possible solution would be to set p->target before unlinking the entry and using it to complete the test. This would leave no dead period where the pendconn is not seen as attached. It is possible, yet extremely difficult, to reproduce this bug, which was first noticed in bug #880. Running 100 servers with maxconn 1 and maxqueue 1 on leastconn and a connect timeout of 30ms under 16 threads with DEBUG_UAF, with a traffic making the backend's queue oscillate around zero (typically using 250 connections with a local httpterm server) may rarely manage to trigger a use-after-free. No backport is needed.	2020-10-23 09:21:55 +02:00
Willy Tarreau	b7ba1d9011	OPTIM: queue: don't call pendconn_unlink() when the pendconn is not queued On connection error processing, we can see massive storms of calls to pendconn_cond_unlink() to release a possible place in the queue. For example, in issue #908, on average half of the threads are caught in this function via back_try_conn_req() consecutive to a synchronous error. However we wait until grabbing the lock to know if the pendconn is effectively in a queue, which is expensive for many cases. We know the transition may only happen from in-queue to out-of-queue so it's safe to first run a preliminary check to see if it's worth going further. This will allow to avoid the cost of locking for most requests. This should not change anything for those completing correctly as they're already run through pendconn_free() which doesn't call pendconn_cond_unlink() unless deemed necessary.	2020-10-22 17:32:28 +02:00
Willy Tarreau	ac66d6bafb	MINOR: proxy: replace the spinlock with an rwlock This is an anticipation of finer grained locking for the queues. For now all lock places take a write lock so that there is no difference at all with previous code.	2020-10-22 17:32:28 +02:00
Willy Tarreau	de785f04e1	MINOR: threads/debug: only report lock stats for used operations In addition to the previous simplification, most locks don't use the seek or read lock (e.g. spinlocks etc) so let's split the dump into distinct operations (write/seek/read) and only report those which were used. Now the output size is roughly divided by 5 compared to previous ones.	2020-10-22 17:32:28 +02:00
Willy Tarreau	23d3b00bdd	MINOR: threads/debug: only report used lock stats The lock stats are very verbose and more than half of them are used in a typical test, making it hard to spot the sought values. Let's simply report "not used" for those which have not been called at all.	2020-10-22 17:32:28 +02:00
Christopher Faulet	d6c48366b8	BUG/MINOR: http-ana: Don't send payload for internal responses to HEAD requests When an internal response is returned to a client, the message payload must be skipped if it is a reply to a HEAD request. The payload is removed from the HTX message just before the message forwarding. This bug has been around for a long time. It was already there in the pre-HTX versions. In legacy HTTP mode, internal errors are not parsed. So this bug cannot be easily fixed. Thus, this patch should only be backported in all HTX versions, as far as 2.0. However, the code has significantly changed in the 2.2. Thus in the 2.1 and 2.0, the patch must be entirely reworked.	2020-10-22 17:13:22 +02:00
Remi Tricot-Le Breton	6cb10384a3	MEDIUM: cache: Add support for 'If-None-Match' request header Partial support of conditional HTTP requests. This commit adds the support of the 'If-None-Match' header (see RFC 7232#3.2). When a client specifies a list of ETags through one or more 'If-None-Match' headers, they are all compared to the one that might have been stored in the corresponding http cache entry until one of them matches. If a match happens, a specific "304 Not Modified" response is sent instead of the cached data. This response has all the stored headers but no other data (see RFC 7232#4.1). Otherwise, the whole cached data is sent. Although unlikely in a GET/HEAD request, the "If-None-Match: *" syntax is valid and also receives a "304 Not Modified" response (RFC 7234#4.3.2). This resolves a part of GitHub issue #821.	2020-10-22 16:10:20 +02:00
Remi Tricot-Le Breton	bcced09b91	MINOR: http: Add etag comparison function Add a function that compares two etags that might be of different types. If any of them is weak, the 'W/' prefix is discarded and a strict string comparison is performed. Co-authored-by: Tim Duesterhus <tim@bastelstu.be>	2020-10-22 16:06:20 +02:00
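The comparison rule described above (drop any 'W/' prefix, then compare the remaining strings strictly) can be sketched as follows. Both function names are hypothetical and operate on plain NUL-terminated strings for simplicity; the function added by this commit is not reproduced here.

```c
#include <string.h>

/* Hypothetical helper: skips a leading "W/" weak marker when present. */
static const char *etag_skip_weak(const char *etag)
{
	return (strncmp(etag, "W/", 2) == 0) ? etag + 2 : etag;
}

/* Hypothetical sketch of the comparison: returns 1 if both ETags carry the
 * same quoted value once any weak prefix has been discarded, otherwise 0.
 */
static int etag_equal(const char *a, const char *b)
{
	return strcmp(etag_skip_weak(a), etag_skip_weak(b)) == 0;
}
```

Under this rule a weak ETag such as `W/"abc"` matches the strong ETag `"abc"`, since only the quoted part takes part in the comparison.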
Tim Duesterhus	2493ee81d4	MINOR: http: Add `enum etag_type http_get_etag_type(const struct ist)` http_get_etag_type returns whether a given `etag` is a strong, weak, or invalid ETag.	2020-10-22 16:02:29 +02:00
William Lallemand	8e8581e242	MINOR: ssl: 'ssl-load-extra-del-ext' removes the certificate extension In issue #785, users are reporting that it's not convenient to load a ".crt.key" when the configuration contains a ".crt". This option allows to remove the extension of the certificate before trying to load any extra SSL file (.key, .ocsp, .sctl, .issuer etc.) The patch changes a little bit the way ssl_sock_load_files_into_ckch() looks for the file.	2020-10-20 18:25:46 +02:00
Christopher Faulet	96ddc8ab43	BUG/MEDIUM: connection: Never cleanup server lists when freeing private conns When a connection is released, depending on its state, it may be detached from the session and it may be removed from the server lists. The first case may happen for private or unsharable active connections. The second one should only be performed for idle or available connections. We never try to remove a connection from the server list if it is attached to a session. But it is also important to never try to remove a private connection from the server lists, even if it is not attached to a session. Otherwise, the curr_used_conn server counter is decremented once too often. This bug was introduced by the commit `04a24c5ea` ("MINOR: connection: don't check priv flag on free"). It is related to the issue #881. It only affects the 2.3, no backport is needed.	2020-10-19 17:19:10 +02:00
Willy Tarreau	69a7b8fc6c	CLEANUP: task: remove the unused and mishandled global_rqueue_size This counter is only updated and never used, and in addition it's done without any atomicity so it's very unlikely to be correct on multi-CPU systems! Let's just remove it since it's not used.	2020-10-19 14:08:13 +02:00
Willy Tarreau	e72a3f4489	CLEANUP: tree-wide: reorder a few structures to plug some holes around locks A few structures were slightly rearranged in order to plug some holes left around the locks. Sizes ranging from 8 to 32 bytes could be saved depending on the structures. No performance difference was noticed (none was expected there), though memory usage might be slightly reduced in some rare cases.	2020-10-19 14:08:13 +02:00
Willy Tarreau	8f1f177ed0	MINOR: threads: change lock_t to an unsigned int We don't need to waste the size of a long for the locks: with the plocks, even an unsigned short would offer enough room for up to 126 threads! Let's use an unsigned int which will be easier to place in certain structures and will more conveniently plug some holes, and atomic ops are at least as fast on 32-bit as on 64-bit. This will not change anything for 32-bit platforms.	2020-10-19 14:08:13 +02:00
Willy Tarreau	3d18498645	CLEANUP: threads: don't register an initcall when not debugging It's a bit overkill to register an initcall to call a function to set a lock to zero when not debugging, let's just declare the lock as pre-initialized to zero.	2020-10-19 14:08:13 +02:00
Ilya Shipitsin	fcb69d768b	BUILD: ssl: make BoringSSL use its own version numbers BoringSSL is a fork of OpenSSL 1.1.0, however in 49e9f67d8b7cbeb3953b5548ad1009d15947a523 it has changed version to 1.1.1. Should fix issue #895. This must be backported to 2.2, 2.1, 2.0, 1.8	2020-10-19 11:34:37 +02:00
Willy Tarreau	cd10def825	MINOR: backend: replace the lbprm lock with an rwlock It was previously a spinlock, and it happens that a number of LB algos only lock it for lookups, without performing any modification. Let's first turn it to an rwlock and w-lock it everywhere. This is strictly identical. It was carefully checked that every HA_SPIN_LOCK() was turned to HA_RWLOCK_WRLOCK() and that HA_SPIN_UNLOCK() was turned to HA_RWLOCK_WRUNLOCK() on this lock. _INIT and _DESTROY were updated too.	2020-10-17 18:51:41 +02:00
Willy Tarreau	61f799b8da	MINOR: threads: add the transitions to/from the seek state Since our locks are based on progressive locks, we support the upgradable seek lock that is compatible with readers and upgradable to a write lock. The main purpose is to take it while seeking down a tree for modification while other threads may seek the same tree for an input (e.g. compute the next event date). The newly supported operations are: HA_RWLOCK_SKLOCK(lbl,l) pl_take_s(l) /* N --> S */ HA_RWLOCK_SKTOWR(lbl,l) pl_stow(l) /* S --> W */ HA_RWLOCK_WRTOSK(lbl,l) pl_wtos(l) /* W --> S */ HA_RWLOCK_SKTORD(lbl,l) pl_stor(l) /* S --> R */ HA_RWLOCK_WRTORD(lbl,l) pl_wtor(l) /* W --> R */ HA_RWLOCK_SKUNLOCK(lbl,l) pl_drop_s(l) /* S --> N */ HA_RWLOCK_TRYSKLOCK(lbl,l) (!pl_try_s(l)) /* N -?> S */ HA_RWLOCK_TRYRDTOSK(lbl,l) (!pl_try_rtos(l)) /* R -?> S */ Existing code paths are left unaffected so this patch doesn't affect any running code.	2020-10-16 16:53:46 +02:00
Willy Tarreau	8d5360ca7f	MINOR: threads: augment rwlock debugging stats to report seek lock stats We currently use only read and write lock operations with rwlocks, but ours also support upgradable seek locks for which we do not report any stats. Let's add them now when DEBUG_THREAD is enabled.	2020-10-16 16:51:49 +02:00
Willy Tarreau	233ad288cd	CLEANUP: protocol: remove the now unused <handler> field of proto_fam->bind() We don't need to specify the handler anymore since it's set in the receiver. Let's remove this argument from the function and clean up the remains of code that were still setting it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	a74cb38e7c	MINOR: protocol: register the receiver's I/O handler and not the protocol's Now we define a new sock_accept_iocb() for socket-based stream protocols and use it as a wrapper for listener_accept() which now takes a listener and not an FD anymore. This will allow the receiver's I/O cb to be redefined during registration, and more specifically to get rid of the hard-coded hacks in protocol_bind_all() made for syslog. The previous ->accept() callback in the protocol was removed since it doesn't have anything to do with accept() anymore but is more generic. A few places where listener_accept() was compared against the FD's IO callback for debugging purposes on the CLI were updated.	2020-10-15 21:47:56 +02:00
Willy Tarreau	d2fb99f9d5	MINOR: protocol: add a default I/O callback and put it into the receiver For now we're still using the protocol's default accept() function as the I/O callback registered by the receiver into the poller. While this is usable for most TCP connections where a listener is needed, this is not suitable for UDP where a different handler is needed. Let's make this configurable in the receiver just like the upper layer is configurable for listeners. In order to ease stream protocols handling, the protocols will now provide a default I/O callback which will be preset into the receivers upon allocation so that almost none of them has to deal with it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	f1dc9f2f17	MINOR: sock: implement sock_accept_conn() to accept a connection The socket-specific accept() code in listener_accept() has nothing to do there. Let's move it to sock.c where it can be significantly cleaned up. It will now directly return an accepted connection and provide a status code instead of letting listener_accept() deal with various errno values. Note that this doesn't support the sockpair specific code. The function is now responsible for dealing with its own receiver's polling state and calling fd_cant_recv() when facing EAGAIN. One tiny change from the previous implementation is that the connection's sockaddr is now allocated before trying accept(), which saves a memcpy() of the resulting address for each accept at the expense of a cheap pool_alloc/pool_free on the final accept returning EAGAIN. This still apparently slightly improves accept performance in microbenchmarks.	2020-10-15 21:47:56 +02:00
Willy Tarreau	1e509a7231	MINOR: protocol: add a new function accept_conn() This per-protocol function will be used to accept an incoming connection and return it as a struct connection*. As such the protocol stack's internal representation of a connection will not need to be handled by the listener code.	2020-10-15 21:47:56 +02:00
Willy Tarreau	7d053e4211	MINOR: sock: rename sock_accept_conn() to sock_accepting_conn() This call was introduced by commit `5ced3e887` ("MINOR: sock: add sock_accept_conn() to test a listening socket") but is actually quite confusing because it makes one think the socket will accept a connection (which is what we want to have in a new function) while it only tells whether it's configured to accept connections. Let's call it sock_accepting_conn() instead. The same change was applied to sockpair which had the same issue.	2020-10-15 21:47:56 +02:00
Willy Tarreau	65ed143841	MINOR: connection: add new error codes for accept_conn() accept_conn() will be used to accept an incoming connection and return it. It will have to deal with various error codes. The currently identified ones were created as CO_AC_*.	2020-10-15 21:47:56 +02:00
Willy Tarreau	83efc320aa	MEDIUM: listener: allocate the connection before queuing a new connection Till now we would keep a per-thread queue of pending incoming connections for which we would store: - the listener - the accepted FD - the source address - the source address' length And these elements were first used in session_accept_fd() running on the target thread to allocate a connection and duplicate them again. Doing this induces various problems. The first one is that session_accept_fd() may only run on file descriptors and cannot be reused for QUIC. The second issue is that it induces lots of memory copies and that the listener queue thrashes a lot of cache, consuming 64 bytes per entry. This patch changes this by allocating the connection before queueing it, and by only placing the connection's pointer into the queue. Indeed, the first two calls used to initialize the connection already store all the information above, which can be retrieved from the connection pointer alone. So we just have to pop one pointer from the target thread, and pass it to session_accept_fd() which only needs the FD for the final settings. This starts to make the accept path a bit more transport-agnostic, and saves memory and CPU cycles at the same time (1% connection rate increase was noticed with 4 threads). Thanks to dividing the accept-queue entry size from 64 to 8 bytes, its size could be increased from 256 to 1024 connections while still dividing the overall size by two. No single queue full condition was met. One minor drawback is that connection may be allocated from one thread's pool to be used into another one. But this already happens a lot with connection reuse so there is really nothing new here.	2020-10-15 21:47:56 +02:00
Willy Tarreau	9b7587a6af	MINOR: connection: make sockaddr_alloc() take the address to be copied Roughly half of the calls to sockaddr_alloc() are made to copy an already known address. Let's optionally pass it in argument so that the function can handle the copy at the same time, this slightly simplifies its usage.	2020-10-15 21:47:56 +02:00
Willy Tarreau	0138f51f93	CLEANUP: fd: finally get rid of fd_done_recv() fd_done_recv() used to be useful with the FD cache because it used to allow to keep a file descriptor active in the poller without being marked as ready in the cache, saving it from ringing immediately, without incurring any system call. It was a way to make it yield to wait for new events leaving a bit of time for others. The only user left was the connection accepter (listen_accept()). We used to suspect that with the FD cache removal it had become totally useless since changing its readiness or not wouldn't change its status regarding the poller itself, which would be the only one deciding to report it again. Careful tests showed that it indeed has exactly zero effect nowadays, the syscall numbers are exactly the same with and without, including when enabling edge-triggered polling. Given that there's no more API available to manipulate it and that it was directly called as an optimization from listener_accept(), it's about time to remove it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	e53e7ec9d9	CLEANUP: protocol: remove the ->drain() function No protocol defines it anymore. The last user used to be the monitor-net stuff that got partially broken already when the tcp_drain() function moved to conn_sock_drain() with commit `e215bba95` ("MINOR: connection: make conn_sock_drain() work for all socket families") in 1.9-dev2. A part of this will surely move back later when non-socket connections arrive with QUIC but better keep the API clean and implement what's needed in time instead.	2020-10-15 21:47:04 +02:00
Willy Tarreau	9e9919dd8b	MEDIUM: proxy: remove obsolete "monitor-net" As discussed here during 2.1-dev, "monitor-net" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200 if ..." and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
Willy Tarreau	77e0daef9f	MEDIUM: proxy: remove obsolete "mode health" As discussed here during 2.1-dev, "mode health" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, doesn't support source filtering, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200" and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
Amaury Denoyelle	04a24c5eaa	MINOR: connection: don't check priv flag on free Do not check CO_FL_PRIVATE flag to check if the connection is in session list on conn_free. This is necessary due to the future patches which add server connections in the session list even if not private, if the mux protocol is the subject of HOL blocking.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	3d3c0918dc	MINOR: mux/connection: add a new mux flag for HOL risk This flag is used to indicate if the mux protocol is subject to head-of-line blocking problem.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	c98df5fb44	MINOR: connection: improve list api usage Replace !LIST_ISEMPTY by LIST_ADDED and LIST_DEL+LIST_INIT by LIST_DEL_INIT for connection session list.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	9c13b62b47	BUG/MEDIUM: connection: fix srv idle count on conn takeover On server connection migration from one thread to another, the wrong idle thread-specific counter is decremented. This bug was introduced since commit `3d52f0f1f8` due to the factorization with srv_use_idle_conn. However, this statement is only executed from conn_backend_get. Extract the decrement from srv_use_idle_conn in conn_backend_get and use the correct thread-specific counter. Rename the function to srv_use_conn to better reflect its purpose as it is also used with a newly initialized connection not in the idle list. As a side change, the connection insertion to available list has also been extracted to conn_backend_get. This will be useful to be able to specify an alternative list for protocol subject to HOL risk that should not be shared between several clients. This bug is only present in this release and thus do not need a backport.	2020-10-15 15:19:34 +02:00
Willy Tarreau	29185140db	MINOR: protocol: make proto_tcp & proto_uxst report listening sockets Now we introduce a new .rx_listening() function to report if a receiver is actually a listening socket. The reason for this is to help detect shared sockets that might have been broken by sibling processes.	2020-10-13 18:15:33 +02:00
Willy Tarreau	5ced3e8879	MINOR: sock: add sock_accept_conn() to test a listening socket At several places we need to check if a socket is still valid and still willing to accept connections. Instead of open-coding this, each time, let's add a new function for this.	2020-10-13 18:15:33 +02:00
Frédéric Lécaille	3fc0fe05fd	MINOR: peers: heartbeat, collisions and handshake information for "show peers" command. This patch adds "coll" new counter and the heartbeat timer values to "show peers" command. It also adds the elapsed time since the last handshake to new "last_hdshk" new peer dump field.	2020-10-09 20:59:58 +02:00
Willy Tarreau	e03204c8e1	MEDIUM: listeners: implement protocol level ->suspend/resume() calls Now we have ->suspend() and ->resume() for listeners at the protocol level. This means that it now becomes possible for a protocol to redefine its own way to suspend and resume. The default functions are provided for TCP, UDP and unix, and they are pass-through to the receiver equivalent as it used to be till now. Nothing was defined for sockpair since it does not need to suspend/resume during reloads, hence it will succeed.	2020-10-09 18:44:37 +02:00
Willy Tarreau	7b2febde1d	MINOR: listeners: split do_unbind_listener() in two The inner part now goes into the protocol and is used to decide how to unbind a given protocol's listener. The existing code which is able to also unbind the receiver was provided as a default function that we currently use everywhere. Some complex listeners like QUIC will use this to decide how to unbind without impacting existing connections, possibly by setting up other incoming paths for the traffic.	2020-10-09 18:44:37 +02:00
Willy Tarreau	f58b8db47b	MEDIUM: receivers: add an rx_unbind() method in the protocols This is used as a generic way to unbind a receiver at the end of do_unbind_listener(). This allows to considerably simplify that function since we can now let the protocol perform the cleanup. The generic code was moved to sock.c, along with the conditional rx_disable() call. Now the code also supports that the ->disable() function of the protocol which acts on the listener performs the close itself and adjusts the RX_F_BOUND flag accordingly.	2020-10-09 18:44:36 +02:00
Willy Tarreau	18c20d28d7	MINOR: listeners: move the LI_O_MWORKER flag to the receiver This listener flag indicates whether the receiver part of the listener is specific to the master or to the workers. In practice it's only used by the master's CLI right now. It's used to know whether or not the FD must be closed before forking the workers. For this reason it's way more of a receiver's property than a listener's property, so let's move it there under the name RX_F_MWORKER. The rest of the code remains unchanged.	2020-10-09 18:43:05 +02:00
Willy Tarreau	75c98d166e	CLEANUP: listeners: remove the do_close argument to unbind_listener() And also remove it from its callers. This subtle distinction was added as sort of a hack for the seamless reload feature but is not needed anymore since do_close became unused with the previous commit ("MEDIUM: listener: let do_unbind_listener() decide whether to close or not"). This also removes the unbind_listener_no_close() function.	2020-10-09 18:41:56 +02:00
Willy Tarreau	02e8557e88	MINOR: protocol: add protocol_stop_now() to instant-stop listeners This will instantly stop all listeners except those which belong to a proxy configured with a grace time. This means that UDP listeners, and peers will also be stopped when called this way.	2020-10-09 18:29:04 +02:00
Willy Tarreau	acde152175	MEDIUM: proxy: centralize proxy status update and reporting There are multiple ways a proxy may switch to the disabled state, but now it's essentially once it loses its last listener. Instead of keeping duplicate code around and reporting the state change before actually seeing it, we now report it at the moment it's performed (from the last listener leaving) which allows to remove the message from all other places.	2020-10-09 18:29:04 +02:00
Willy Tarreau	a389c9e1e3	MEDIUM: proxy: add mode PR_MODE_PEERS to flag peers frontends For now we cannot easily distinguish a peers frontend from another one, which will be problematic to avoid reporting them when stopping their listeners. Let's add PR_MODE_PEERS for this. It's not supposed to cause any issue since all non-HTTP proxies are handled similarly now.	2020-10-09 18:28:21 +02:00
Willy Tarreau	caa7df1296	MINOR: listeners: add a new stop_listener() function This function will be used to definitely stop a listener (e.g. during a soft_stop). This is actually tricky because it may be called for a proxy or for a protocol, both of which require locks and already hold some. The function takes booleans indicating which ones are already held, hoping this will be enough. It's not well defined whether proto->disable() and proto->rx_disable() are supposed to be called with any lock held, and they are used from do_unbind_listener() with all these locks. Some back annotations ought to be added on this point. The proxy's listeners count is updated, and the proxy is marked as disabled and woken up after the last one is gone. Note that a listener in listen state is already not attached anymore since it was disabled.	2020-10-09 18:27:48 +02:00
Willy Tarreau	b4c083f5bf	MINOR: listeners: split delete_listener() in two versions We'll need an already locked variant of this function so let's make __delete_listener() which will be called with the protocol lock held and the listener's lock held.	2020-10-09 11:27:30 +02:00
Willy Tarreau	5ddf1ce9c4	MINOR: protocol: add a new pair of enable/disable methods for listeners These methods will be used to enable/disable accepting new connections so that listeners do not play with FD directly anymore. Since all the currently supported protocols work on socket for now, these are identical to the rx_enable/rx_disable functions. However they were not defined in sock.c since it's likely that some will quickly start to differ. At the moment they're not used. We have to take care of fd_updt before calling fd_{want,stop}_recv() because it's allocated fairly late in the boot process and some such functions may be called very early (e.g. to stop a disabled frontend's listeners).	2020-10-09 11:27:30 +02:00
Willy Tarreau	686fa3db50	MINOR: protocol: add a new pair of rx_enable/rx_disable methods These methods will be used to enable/disable rx at the receiver level so that callers don't play with FDs directly anymore. All our protocols use the generic ones from sock.c at the moment. For now they're not used.	2020-10-09 11:27:30 +02:00
Willy Tarreau	e70c7977f2	MINOR: sock: provide a set of generic enable/disable functions These will be used on receivers, to enable or disable receiving on a listener, which most of the time just consists in enabling/disabling the file descriptor. We have to take care of the existence of fd_updt to know if we may or not call fd_{want,stop}_recv() since it's not permitted in very early boot.	2020-10-09 11:27:30 +02:00
Willy Tarreau	58e6b71bb0	MINOR: protocol: implement an ->rx_resume() method This one undoes ->rx_suspend(), it tries to restore an operational socket. It was only implemented for TCP since it's the only one we support right now.	2020-10-09 11:27:30 +02:00
Willy Tarreau	cb66ea60cf	MINOR: protocol: replace ->pause(listener) with ->rx_suspend(receiver) The ->pause method is inappropriate since it doesn't exactly "pause" a listener but rather temporarily disables it so that it's not visible at all to let another process take its place. The term "suspend" is more suitable, since the "pause" is actually what we'll need to apply to the FULL and LIMITED states which really need to make a pause in the accept process. And it goes well with the use of the "resume" function that will also need to be made per-protocol. Let's rename the function and make it act on the receiver since it's already what it essentially does, hence the prefix "_rx" to make it more explicit. The protocol struct was a bit reordered because it was becoming a real mess between the parts related to the listeners and those for the receivers.	2020-10-09 11:27:30 +02:00
Willy Tarreau	d7f331c8b8	MINOR: protocol: rename the ->listeners field to ->receivers Since the listeners were split into receiver+listener, this field ought to have been renamed because it's confusing. It really links receivers and not listeners, as most of the time it's used via rx.proto_list! The nb_listeners field was updated accordingly.	2020-10-09 11:27:30 +02:00
Willy Tarreau	dae0692717	CLEANUP: listeners: remove the now unused enable_all_listeners() It's not used anymore since previous commit. The good thing is that no more listener function now directly acts on a protocol.	2020-10-09 11:27:30 +02:00
Willy Tarreau	078e1c7102	CLEANUP: protocol: remove the ->enable_all method It's not used anymore, now the listeners are enabled from protocol_enable_all().	2020-10-09 11:27:30 +02:00
Willy Tarreau	7834a3f70f	MINOR: listeners: export enable_listener() we'll soon call it from outside.	2020-10-09 11:27:30 +02:00
Willy Tarreau	d008009958	CLEANUP: listeners: remove unused disable_listener and disable_all_listeners These ones have never been called, they were referenced by the protocol's disable_all for some protocols but there are no traces of their use, so in addition to not being sure the code works, it has never been tested. Let's remove a bit of complexity starting from there.	2020-10-09 11:27:30 +02:00
Willy Tarreau	fb4ead8e8a	CLEANUP: protocol: remove the ->disable_all method This one has never been used, is only referenced by proto_uxst and proto_sockpair, and it's not even certain it works at all. Let's get rid of it.	2020-10-09 11:27:30 +02:00
Willy Tarreau	1accacbcc3	CLEANUP: proxy: remove the now unused pause_proxies() and resume_proxies() They're not used anymore, delete them before someone thinks about using them again!	2020-10-09 11:27:30 +02:00
Willy Tarreau	09819d1118	MINOR: protocol: introduce protocol_{pause,resume}_all() These two functions are used to pause and resume all listeners of all protocols. They use the standard listener functions for this so they're supposed to handle the situation gracefully regardless of the upper proxies' states, and they will report completion on proxies once the switch is performed. It might be nice to define a particular "failed" state for listeners that cannot resume and to count them on proxies in order to mention that they're definitely stuck. On the other hand, the current situation is retryable which is quite appreciable as well.	2020-10-09 11:27:30 +02:00
Willy Tarreau	337c835d16	MEDIUM: proxy: merge zombify_proxy() with stop_proxy() The two functions don't need to be distinguished anymore since they have all the necessary info to act as needed on their listeners. Let's just pass via stop_proxy() and make it check for each listener which one to close or not.	2020-10-09 11:27:30 +02:00
Willy Tarreau	43ba3cf2b5	MEDIUM: proxy: remove start_proxies() Its sole remaining purpose was to display "proxy foo started", which has little benefit and pollutes output for those with plenty of proxies. Let's remove it now. The VTCs were updated to reflect this, because many of them had explicit counts of dropped lines to match this message. This is tagged as MEDIUM because some users may be surprised by the loss of this quite old message.	2020-10-09 11:27:30 +02:00
Willy Tarreau	c3914d4fff	MEDIUM: proxy: replace proxy->state with proxy->disabled The remaining proxy states were only used to distinguish an enabled proxy from a disabled one. Due to the initialization order, both PR_STNEW and PR_STREADY were equivalent after startup, and they would only differ from PR_STSTOPPED when the proxy is disabled or shutdown (which is effectively another way to disable it). Now we just have a "disabled" field which allows to distinguish them. It's becoming obvious that start_proxies() is only used to print a greeting message now, that we'd rather get rid of. Probably that zombify_proxy() and stop_proxy() should be merged once their differences move to the right place.	2020-10-09 11:27:30 +02:00
Willy Tarreau	1ad64acf6c	CLEANUP: peers: don't use the PR_ST* states to mark enabled/disabled The enabled/disabled config options were stored into a "state" field that is an integer but contained only PR_STNEW or PR_STSTOPPED, which is a bit confusing, and causes a dependency with proxies. This was renamed to "disabled" and is used as a boolean. The field was also moved to the end of the struct to stop creating a hole and fill another one.	2020-10-09 11:27:30 +02:00
Willy Tarreau	f18d968830	MEDIUM: proxy: remove state PR_STPAUSED This state was used to mention that a proxy was in PAUSED state, as opposed to the READY state. This was causing some trouble because if a listener failed to resume (e.g. because its port was temporarily in use during the resume), it was not possible to retry the operation later. Now by checking the number of READY or PAUSED listeners instead, we can accurately know if something went bad and try to fix it again later. The case of the temporary port conflict during resume now works well: $ socat readline /tmp/sock1 prompt > disable frontend testme3 > disable frontend testme3 All sockets are already disabled. > enable frontend testme3 Failed to resume frontend, check logs for precise cause (port conflict?). > enable frontend testme3 > enable frontend testme3 All sockets are already enabled.	2020-10-09 11:27:30 +02:00
Willy Tarreau	a17c91b37f	MEDIUM: proxy: remove the PR_STERROR state This state is only set when a pause() fails but isn't even set when a resume() fails. And we cannot recover from this state. Instead, let's just count remaining ready listeners to decide to emit an error or not. It's more accurate and will better support new attempts if needed.	2020-10-09 11:27:30 +02:00
Willy Tarreau	6b3bf733dd	MEDIUM: proxy: remove the unused PR_STFULL state Since v1.4 or so, it's almost not possible anymore to set this state. The only exception is by using the CLI to change a frontend's maxconn setting below its current usage. This case makes no sense, and for other cases it doesn't make sense either because "full" is a vague concept when only certain listeners are full and not all. Let's just remove this unused state and make it clear that it's not reported. The "ready" or "open" states will continue to be reported without being misleading as they will be opposed to "stop".	2020-10-09 11:27:30 +02:00
Willy Tarreau	efc0eec4c1	MINOR: proxy: maintain per-state counters of listeners The proxy state tries to be synthetic but that doesn't work well with many listeners, especially for transition phases or after a failed pause/resume. In order to address this, we'll instead rely on counters of listeners in a given state for the 3 major states (ready, paused, listen) and a total counter. We'll now be able to determine a proxy's state by comparing these counters only.	2020-10-09 11:27:30 +02:00
Willy Tarreau	a37b244509	MINOR: listeners: introduce listener_set_state() This function is used as a wrapper to set a listener's state everywhere. We'll use it later to maintain some counters in a consistent state when switching state so it's capital that all state changes go through it. No functional change was made beyond calling the wrapper.	2020-10-09 11:27:30 +02:00
Willy Tarreau	c6dac6c7f5	MEDIUM: listeners: remove the now unused ZOMBIE state The zombie state is not used anymore by the listeners, because in the last two cases where it was tested it couldn't match as it was covered by the test on the process mask. Instead now the FD is either in the LISTEN state or the INIT state. This also avoids forcing the listener to be single-dimensional because actually belonging to another process isn't totally exclusive with the other states, which explains some of the difficulties requiring to check the proc_mask and the fd sometimes. So let's get rid of it now not to be tempted to reuse it. The doc on the listeners state was updated.	2020-10-09 11:27:29 +02:00
Emeric Brun	b0c331f71f	BUG/MINOR: proxy/log: frontend/backend and log forward names must differ This patch disallows using the same name for a log forward section and a frontend/backend section.	2020-10-08 08:53:26 +02:00
Emeric Brun	6d75616951	MINOR: channel: new getword and getchar functions on channel. This patch adds two new functions to get a char or a word from a channel.	2020-10-07 17:17:27 +02:00
Emeric Brun	2897644ae5	MINOR: stats: inc req counter on listeners. This patch enables count of requests for listeners if listener's counters are enabled.	2020-10-07 17:17:27 +02:00
Amaury Denoyelle	fbd0bc98fe	MINOR: dns/stats: integrate dns counters in stats Use the new stats module API to integrate the dns counters in the standard stats. This is done in order to avoid code duplication, keep the code related to cli out of dns and use the full possibility of the stats function, allowing to print dns stats in csv or json format.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	0b70a8a314	MINOR: stats: add config "stats show modules" By default, hide the extra statistics on the html page. Define a new flag STAT_SHMODULES which is activated if the config "stats show modules" is set.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	d3700a7fda	MINOR: stats: support clear counters for dynamic stats Add a boolean 'clearable' on stats module structure. If set, it forces all the counters to be reset on 'clear counters' cli command. If not, the counters are reset only when 'clear counters all' is used.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	ee63d4bd67	MEDIUM: stats: integrate static proxies stats in new stats This is executed on startup with the registered statistics module. The existing statistics have been merged in a list containing all statistics for each domain. This is useful to print all available statistics in a generic way. Allocate extra counters for all proxies/servers/listeners instances. These counters are allocated with the counters from the stats modules registered on startup.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	730c727ea3	MEDIUM: stats: add abstract type to store counters Implement a small API to easily add extra counters inside a structure instance. This will be used to implement dynamic statistics linked on every type of object as needed. The counters are stored in a dynamic array inside the relevant objects.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	58d395e0d6	MEDIUM: stats: define an API to register stat modules A stat module can be registered to quickly add new statistics on haproxy. It must be attached to one of the available stats domain. The register must be done using INITCALL on STG_REGISTER. The stat module has a name which should be unique for each new module in a domain. It also contains a statistics list with their name/desc and a pointer to a function used to fill the stats from the module counters. The module also provides the initial counters values used on automatically allocated counters. The offset for these counters are stored in the module structure.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	72b16e5173	MINOR: stats: define additional flag px cap on domain This flag can be used to determine on what type of proxy object the statistics should be relevant. It will be useful when adding dynamic statistics. Currently, this flag is not used.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	072f97eddf	MINOR: stats: define the concept of domain for statistics The domain option will be used to have statistics attached to other objects than proxies/listeners/servers. At the moment, only the PROXY domain is available. Add an argument 'domain' on the 'show stats' cli command to specify the domain. Only 'domain proxy' is available now. If not specified, proxy will be considered the default domain. For HTML output, only proxy statistics will be displayed.	2020-10-05 12:02:14 +02:00
Amaury Denoyelle	da5b6d1cd9	MINOR: stats: hide px/sv/li fields in applet struct Use an opaque pointer to store proxy instance. Regroup server/listener as a single opaque pointer. This has the benefit to render the structure more evolutive to support statistics on other types of objects in the future. This patch is needed to extend stat support for components other than proxies objects. The prometheus module has been adapted for these changes.	2020-10-05 10:48:58 +02:00
Amaury Denoyelle	97323c9ed4	MINOR: stats: add stats size as a parameter for csv/json dump Render the stats size parametric in csv/json dump functions. This is needed for the future patch which provides dynamic stats. For now the static value ST_F_TOTAL_FIELDS is provided. Remove unused parameter px on stats_dump_one_line. This patch is needed to extend stat support to components other than proxies objects.	2020-10-05 09:06:10 +02:00
Amaury Denoyelle	3ca927e68f	REORG: stats: export some functions Un-mark stats_dump_one_line and stats_putchk as static and export them in the header file. These functions will be reusable by other components to print their statistics. This patch is needed to extend stat support to components other than proxies objects.	2020-10-05 09:06:10 +02:00
Amaury Denoyelle	cd3de50779	MINOR: counters: fix a typo in comment Wrong copy/paste comment, replace listeners/frontends by servers/backends This may be backported up to 1.7.	2020-10-05 09:05:57 +02:00
Willy Tarreau	fac0f645df	BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe A crash reported in github issue #880 looks impossible unless pendconn_cond_unlink() occasionally sees a null leaf_p when attempting to remove an entry, which seems to be confirmed by the reporter. What seems to be happening is that depending on compiler optimizations, this pointer can appear as null while pointers are moved if one of the node's parents is removed from or inserted into the tree. There's no explicit null of the pointer during these operations but those pointers are rewritten in multiple steps and nothing prevents this situation from happening, and there are no particular barrier nor atomic ops around this. This test was used to avoid unnecessary locking, for already deleted entries, but looking at the code it appears that pendconn_free() already resets s->pend_pos that's used as <p> there, and that the other call reasons are after an error where the connection will be dropped as well. So we don't save anything by doing this test, and make it unsafe. The older code used to check for list emptiness there and not inside pendconn_unlink(), which explains why the code has stayed there. Let's just remove this now. Thanks to @jaroslawr for reporting this issue in great details and for testing the proposed fix. This should be backported to 1.8, where the test on LIST_ISEMPTY should be moved to pendconn_unlink() instead (inside the lock, just like 2.0+).	2020-10-02 18:10:26 +02:00
Amaury Denoyelle	fa41cb6792	MINOR: tools: support for word expansion of environment in parse_line Allow the syntax "${...[*]}" to expand an environment variable containing several values separated by spaces as individual arguments. A new flag PARSE_OPT_WORD_EXPAND has been added to toggle this feature on parse_line invocation. In case of an invalid syntax, a new error PARSE_ERR_WRONG_EXPAND will be triggered. This feature has been asked on the github issue #165.	2020-10-01 17:24:14 +02:00
Willy Tarreau	3ca2365904	BUG/MEDIUM: h2: report frame bits only for handled types As part of his GREASE experiments on Chromium, Bence Béky reported in https://lists.w3.org/Archives/Public/ietf-http-wg/2020JulSep/0202.html and https://bugs.chromium.org/p/chromium/issues/detail?id=1127060 that a certain combination of frame type and frame flags was causing an error on app.slack.com. It turns out that it's haproxy that is causing this issue because the frame type is wrongly assumed to support padding, the frame flags indicate padding is present, and the frame is too short for this, resulting in an error. The reason why only some frame types are affected is due to the frame type being used in a bit shift to match against a mask, and where the 5 lower bits of the frame type only are used to compute the frame bit. If the resulting frame bit matches a DATA, HEADERS or PUSH_PROMISE frame bit, then padding support is assumed and the test is enforced, resulting in a PROTOCOL_ERROR or FRAME_SIZE_ERROR depending on the payload size. We must never match any such bit for unsupported frame types so let's add a check for this. This must be backported as far as 1.8. Thanks to Cooper Bethea for providing enough context to help narrow the issue down and to Bence Béky for creating a simple reproducer.	2020-09-18 08:05:03 +02:00
Willy Tarreau	2b5e0d8b6a	MEDIUM: proto_udp: replace last AF_CUST_UDP* with AF_INET* We don't need to cheat with the sock_domain anymore, we now always have the SOCK_DGRAM sock_type as a complementary selector. This patch restores the sock_domain to AF_INET* in the udp* protocols and removes all traces of the now unused AF_CUST_*.	2020-09-16 22:08:08 +02:00
Willy Tarreau	910c64da96	MEDIUM: protocol: store the socket and control type in the protocol array The protocol array used to be only indexed by socket family, which is very problematic with UDP (requiring an extra family) and with the forthcoming QUIC (also requiring an extra family), especially since that binds them to certain families, prevents them from supporting dgram UNIX sockets etc. In order to address this, we now start to register the protocols with more info, namely the socket type and the control type (either stream or dgram). This is sufficient for the protocols we have to deal with, but could also be extended further if multiple protocol variants were needed. But as is, it still fits nicely in an array, which is convenient for lookups that are instant.	2020-09-16 22:08:08 +02:00
Willy Tarreau	a54553f74f	MINOR: protocol: add the control layer type in the protocol struct This one will be needed to more accurately select a protocol. It may differ from the socket type for QUIC, which uses dgram at the socket layer and provides stream at the control layer. The upper level requests a control layer only so we need this field.	2020-09-16 22:08:08 +02:00
Willy Tarreau	65ec4e3ff7	MEDIUM: tools: make str2sa_range() check that the protocol has ->connect() Most callers of str2sa_range() need the protocol only to check that it provides a ->connect() method. It used to be used to verify that it's a stream protocol, but it might be a bit early to get rid of it. Let's keep the test for now but move it to str2sa_range() when the new flag PA_O_CONNECT is present. This way almost all call places could be cleaned from this. There's a strange test in the server address parsing code that rechecks the family from the socket which seems to be a duplicate of the previously removed tests. It will have to be rechecked.	2020-09-16 22:08:08 +02:00
Willy Tarreau	5fc9328aa2	MINOR: tools: make str2sa_range() directly return the protocol We'll need this so that it can return pointers to stacked protocol in the future (for QUIC). In addition this removes a lot of tests for protocol validity in the callers. Some of them were checked further apart, or after a call to str2listener() and they were simplified as well. There's still a trick, we can fail to return a protocol in case the caller accepts an fqdn for use later. This is what servers do and in this case it is valid to return no protocol. A typical example is: server foo localhost:1111	2020-09-16 22:08:08 +02:00
Willy Tarreau	9b3178df23	MINOR: listener: pass the chosen protocol to create_listeners() The function will need to use more than just a family, let's pass it the selected protocol. The caller will then be able to do all the fancy stuff required to pick the best protocol.	2020-09-16 22:08:08 +02:00
Willy Tarreau	aa333123f2	MINOR: cfgparse: add str2receiver() to parse dgram receivers This is at least temporary, as the migration at once is way too difficult. For now it still creates listeners but only allows DGRAM sockets. This aims at easing the split between listeners and receivers.	2020-09-16 22:08:08 +02:00
Willy Tarreau	a93e5c7fae	MINOR: tools: make str2sa_range() optionally return the fd If a file descriptor was passed, we can optionally return it. This will be useful for listening sockets which are both a pre-bound FD and a ready socket.	2020-09-16 22:08:08 +02:00
Willy Tarreau	909c23b086	MINOR: listener: remove the inherited arg to create_listener() This argument can now safely be determined from fd != -1, let's just drop it.	2020-09-16 22:08:08 +02:00
Willy Tarreau	328199348b	MINOR: tools: add several PA_O_* flags in str2sa_range() callers These flags indicate whether the call is made to fill a bind or a server line, or even just send/recv calls (like logs or dns). Some special cases are made for outgoing FDs (e.g. pipes for logs) or socket FDs (e.g external listeners), and there's a distinction between stream or dgram usage that's expected to significantly help str2sa_range() proceed appropriately with the input information. For now they are not used yet.	2020-09-16 22:08:08 +02:00
Willy Tarreau	809587635e	MINOR: tools: add several PA_O_PORT_* flags in str2sa_range() callers These flags indicate what is expected regarding port specifications. Some callers accept none, some need fixed ports, some have it mandatory, some support ranges, and some take an offset. Each possibility is reflected by an option. For now they are not exploited, but the goal is to instrument str2sa_range() to properly parse that.	2020-09-16 22:08:07 +02:00
Willy Tarreau	cd3a5591f6	MINOR: tools: make str2sa_range() take more options than just resolve We currently have an argument to require that the address is resolved but we'll soon add more, so let's turn it into a bit field. The old "resolve" boolean is now PA_O_RESOLVE.	2020-09-16 22:08:07 +02:00
Willy Tarreau	a5b325f92c	MINOR: protocol: add a real family for existing FDs At some places (log fd@XXX, bind fd@XXX) we support using an explicit file descriptor number, that is placed into the sockaddr for later use. The problem is that till now it was done with an AF_UNSPEC family, which is also used for other situations like missing info or rings (for logs). Let's create an "official" family AF_CUST_EXISTING_FD for this case so that we are certain the FD can be found in the address when it is set.	2020-09-16 22:08:07 +02:00
Willy Tarreau	1e984b73f0	CLEANUP: protocol: remove family-specific fields from struct protocol This removes the following fields from struct protocol that are now retrieved from the protocol family instead: .sock_family, .sock_addrlen, .l3_addrlen, .addrcmp, .bind, .get_src, .get_dst. This also removes the UDP-specific udp{,6}_get_{src,dst}() functions which were referenced but not used yet. Their goal was only to remap the original AF_INET* addresses to AF_CUST_UDP*. Note that .sock_domain is still there as it's used as a selector for the protocol struct to be used.	2020-09-16 22:08:07 +02:00
Willy Tarreau	f1f660978c	MINOR: protocol: retrieve the family-specific fields from the family We now take care of retrieving sock_family, l3_addrlen, bind(), addrcmp(), get_src() and get_dst() from the protocol family and not just the protocol itself. There are very few places, this was only seldom used. Interestingly, sock_inet.c used to rely on ->sock_family instead of ->sock_domain, and sock_unix.c used to hard-code PF_UNIX instead of using ->sock_domain. Also it appears obvious we have something wrong in the protocol selection algorithm because sock_domain is the one set to the custom protocols while it ought to be sock_family instead, which would avoid having to hard-code some conversions for UDP namely.	2020-09-16 22:08:07 +02:00
Willy Tarreau	b0254cb361	MINOR: protocol: add a new proto_fam structure for protocol families We need to specially handle protocol families which regroup common functions used for a given address family. These functions include bind(), addrcmp(), get_src() and get_dst() for now. Some fields are also added about the address family, socket domain (protocol family passed to the socket() syscall), and address length. These protocol families are referenced from the protocols but not yet used.	2020-09-16 22:08:07 +02:00
Willy Tarreau	62292b28a3	MEDIUM: sockpair: implement sockpair_bind_receiver() Note that for now we don't have a sockpair.c file to host that unusual family, so the new function was placed directly into proto_sockpair.c. It's no big deal given that this family is currently not shared with multiple protocols. The function does almost nothing but setting up the receiver. This is normal as the socket the FDs are passed onto are supposed to have been already created somewhere else, and the only usable identifier for such a socket pair is the receiving FD itself. The function was assigned to sockpair's ->bind() and is not used yet.	2020-09-16 22:08:07 +02:00
Willy Tarreau	1e0a860099	MEDIUM: sock_unix: implement sock_unix_bind_receiver() This function performs all the bind-related stuff for UNIX sockets that was previously done in uxst_bind_listener(). There is a very tiny difference however, which is that previously, in the unlikely event where listen() would fail, it was still possible to roll back the binding and rename the backup to the original socket. Now we have to rename it before returning, hence it will be done before calling listen(). However, this doesn't cover any particular use case since listen() has no reason to fail there (and the rollback is not done for inherited sockets), that was just done that way as a generic error processing path. The code is not used yet and is referenced in the uxst proto's ->bind().	2020-09-16 22:08:07 +02:00
Willy Tarreau	d69ce1ffbc	MEDIUM: sock_inet: implement sock_inet_bind_receiver() This function collects all the receiver-specific code from both tcp_bind_listener() and udp_bind_listener() in order to provide a more generic AF_INET/AF_INET6 socket binding function. For now the API is not very elegant because some info are still missing from the receiver while there's no ideal place to fill them except when calling ->listen() at the protocol level. It looks like some polishing code is needed in check_config_validity() or somewhere around this in order to finalize the receivers' setup. The main issue is that listeners and receivers are created before bind_conf options are parsed and that there's no finishing step to resolve some of them. The function currently sets up a receiver and subscribes it to the poller. In an ideal world we wouldn't subscribe it but let the caller do it after having finished to configure the L4 stuff. The problem is that the caller would then need to perform an fd_insert() call and to possibly set the exported flag on the FD while it's not its job. Maybe an improvement could be to have a separate sock_start_receiver() call in sock.c. For now the function is not used but it will soon be. It's already referenced as tcp and udp's ->bind().	2020-09-16 22:08:07 +02:00
Willy Tarreau	3e5c7ab7ce	MINOR: protocol: add a new ->bind() entry to bind the receiver This will be the function that must be used to bind the receiver. It solely depends on the address family but for now it's simpler to have it per protocol.	2020-09-16 22:08:07 +02:00
Willy Tarreau	b3580b19c8	MINOR: protocol: rename the ->bind field to ->listen The function currently is doing both the bind() and the listen(), so let's call it ->listen so that the bind() operation can move to another place.	2020-09-16 22:08:07 +02:00
Willy Tarreau	c049c0d5ad	MINOR: sock: make sock_find_compatible_fd() only take a receiver We don't need to have a listener anymore to find an fd, a receiver with its settings properly set is enough now.	2020-09-16 22:08:07 +02:00
Willy Tarreau	3fd3bdc836	MINOR: receiver: move the FOREIGN and V6ONLY options from listener to settings The new RX_O_FOREIGN, RX_O_V6ONLY and RX_O_V4V6 options are now set into the rx_settings part during the parsing, so that we don't need to adjust them in each and every listener anymore. We have to keep both v4v6 and v6only due to the precedence from v6only over v4v6.	2020-09-16 22:08:07 +02:00
Willy Tarreau	43046fa4f4	MINOR: listener: move the INHERITED flag down to the receiver It's the receiver's FD that's inherited from the parent process, not the listener's so the flag must move to the receiver so that appropriate actions can be taken.	2020-09-16 22:08:07 +02:00
Willy Tarreau	0b9150155e	MINOR: receiver: add a receiver-specific flag to indicate the socket is bound In order to split the receiver from the listener, we'll need to know that a socket is already bound and ready to receive. We used to do that via the LI_O_ASSIGNED state but that's not sufficient anymore since the receiver might not belong to a listener anymore. The new RX_F_BOUND flag is used for this.	2020-09-16 22:08:07 +02:00
Willy Tarreau	eef454224d	MINOR: receiver: link the receiver to its owner A receiver will have to pass a context to be installed into the fdtab for use by the handler. We need to set this into the receiver struct as the bind will happen long after the configuration.	2020-09-16 22:08:07 +02:00
Willy Tarreau	0fce6bce34	MINOR: receiver: link the receiver to its settings Just like listeners keep a pointer to their bind_conf, receivers now also have a pointer to their rx_settings. All those belonging to a listener are automatically initialized with a pointer to the bind_conf's settings.	2020-09-16 22:08:07 +02:00
Willy Tarreau	d45693d85c	REORG: listener: move the receiver part to a new file We'll soon add flags for the receivers, better add them to the final file, so it's time to move the definition to receiver-t.h. The struct receiver and rx_settings were placed there.	2020-09-16 22:08:07 +02:00
Willy Tarreau	b743661f04	REORG: listener: move the listener's proto to the receiver The receiver is the one which depends on the protocol while the listener relies on the receiver. Let's move the protocol there. Since there's also a list element to get back to the listener from the proto list, this list element (proto_list) was moved as well. For now when scanning protos, we still see listeners which are linked by their rx.proto_list part.	2020-09-16 22:08:05 +02:00
Willy Tarreau	38ba647f9f	REORG: listener: move the receiving FD to struct receiver The listening socket is represented by its file descriptor, which is generic to all receivers and not just listeners, so it must move to the rx struct. It's worth noting that in order to extend receivers and listeners to other protocols such as QUIC, we'll need other handles than file descriptors here, and that either a union or a cast to uintptr_t will have to be used. This was not done yet and the field was preserved under the name "fd" to avoid adding confusion.	2020-09-16 22:08:03 +02:00
Willy Tarreau	371590661e	REORG: listener: move the listening address to a struct receiver The address will be specific to the receiver so let's move it there.	2020-09-16 22:08:01 +02:00
Willy Tarreau	37d9d6721a	REORG: listener: create a new struct receiver In order to start to split the listeners into the listener part and the event receiver part, we introduce a new field "rx" into struct listener that will eventually become a separate struct receiver. This patch only adds the struct with an options field that the receivers will need.	2020-09-16 22:07:58 +02:00
Willy Tarreau	be56c1038f	MINOR: listener: move the network namespace to the struct settings The netns is common to all listeners/receivers and is used to bind the listening socket so it must be in the receiver settings and not in the listener. This removes yet another set of unnecessary loops.	2020-09-16 20:13:13 +02:00
Willy Tarreau	7e307215e8	MINOR: listener: move the interface to the struct settings The interface is common to all listeners/receivers and is used to bind the listening socket so it must be in the receiver settings and not in the listener. This removes some unnecessary loops.	2020-09-16 20:13:13 +02:00
Willy Tarreau	e26993c098	MINOR: listener: move bind_proc and bind_thread to struct settings As mentioned previously, these two fields come under the settings struct since they'll be used to bind receivers as well.	2020-09-16 20:13:13 +02:00
Willy Tarreau	6e459d7f92	MINOR: listener: create a new struct "settings" in bind_conf There currently is a large inconsistency in how binding parameters are split between bind_conf and listeners. It happens that for historical reasons some parameters are available at the listener level but cannot be configured per-listener but only for a bind_conf, and thus, need to be replicated. In addition, some of the bind_conf parameters are in fact for the listening socket itself while others are for the instantiated sockets. A previous attempt at splitting listeners into receivers failed because the boundary between all these settings is not well defined. This patch introduces a level of listening socket settings in the bind_conf, that will be detachable later. Such settings that are solely for the listening socket are: - unix socket permissions (used only during binding) - interface (used for binding) - network namespace (used for binding) - process mask and thread mask (used during startup) The rest seems to be used only to initialize the resulting sockets, or to control the accept rate. For now, only the unix params (bind_conf->ux) were moved there.	2020-09-16 20:13:13 +02:00
William Lallemand	70bf06e5f0	BUILD: fix build with openssl < 1.0.2 since bundle removal Bundle removal broke the build with openssl version < 1.0.2. Remove the #ifdef around SSL_SOCK_KEYTYPE_NAMES.	2020-09-16 18:10:00 +02:00
William Lallemand	e7eb1fec2f	CLEANUP: ssl: remove utility functions for bundle Remove the last utility functions for handling the multi-cert bundles and remove the multi-variable from the ckch structure. With this patch, the bundles are completely removed.	2020-09-16 16:28:26 +02:00
William Lallemand	bd8e6eda59	CLEANUP: ssl: remove test on "multi" variable in ckch functions Since the removal of the multi-certificates bundle support, this variable is not useful anymore, we can remove all tests for this variable and suppose that every ckch contains a single certificate.	2020-09-16 16:28:26 +02:00
Willy Tarreau	441b6c31e9	BUILD: connection: fix build on clang after the VAR_ARRAY cleanup Commit `4987a4744` ("CLEANUP: tree-wide: use VAR_ARRAY instead of [0] in various definitions") broke the build on clang due to the tlv field used to receive/send the proxy protocol. The problem is that struct tlv is included at the beginning of struct tlv_ssl, which doesn't make much sense. In fact the value[] array isn't really a var array but just an end of struct marker, and must really be an array of size zero.	2020-09-14 08:43:51 +02:00
Willy Tarreau	4987a47446	CLEANUP: tree-wide: use VAR_ARRAY instead of [0] in various definitions Surprisingly there were still a number of [0] definitions for variable sized arrays in certain structures all over the code. We need to use VAR_ARRAY instead of zero to accommodate various compilers' preferences, as zero was used only on old ones and tends to report errors on new ones.	2020-09-12 20:56:41 +02:00
Ilya Shipitsin	4a034f2212	BUILD: introduce possibility to define ABORT_NOW() conditionally code analysis tools recognize abort() better, so let us introduce such possibility	2020-09-12 13:11:27 +02:00
Willy Tarreau	00c363ba9d	REORG: tools: move PARSE_OPT_* from tools.h to tools-t.h These would better be placed into the low-level type files with other similar macros.	2020-09-11 11:27:22 +02:00
Willy Tarreau	76296dce68	BUILD: trace: always have an argument before variadic args in macros tcc supports variadic macros provided that there is always at least one argument, like older gcc versions. Thus we need to always keep one and define args as the remaining ones. It's not an issue at all and doesn't change the way to use them, just the internal definitions.	2020-09-10 09:35:54 +02:00
Willy Tarreau	d966f1497c	BUILD: intops: on x86_64, the bswap instruction is called bswapq Building with tcc fails on "bswap" which in fact ought to be called "bswapq". Let's rename it as gas doesn't care.	2020-09-10 09:31:50 +02:00
Willy Tarreau	f6afda6539	BUILD: compiler: workaround a glibc madness around __attribute__() For whatever reason, glibc decided that the __attribute__ keyword is the exclusive property of gcc, and redefines it to an empty macro on other compilers. Some non-gcc compilers also support it (possibly partially), tinycc is one of them. By doing this, glibc silently broke all constructors, resulting in code that arrives in main() with uninitialized variables. The solution we use here consists in undefining the macro on non-gcc compilers, and redefining it to itself in order to cause a conflict in the event the redefinition would happen afterwards. This visibly solved the problem.	2020-09-10 09:26:50 +02:00
Willy Tarreau	d9537f6082	BUILD: compiler: reserve the gcc version checks to the gcc compiler Some checks on __GNUC__ imply that if it's undefined it will match a low value but that's not always what we want, like for example in the VAR_ARRAY definition which is not needed on tcc. Let's always be explicit on these tests.	2020-09-10 08:35:28 +02:00
Christopher Faulet	5a89175ac8	BUG/MEDIUM: dns: Don't store additional records in a linked-list A SRV record keeps a reference on the corresponding additional record, if any. But this additional record is also inserted in a separate linked-list into the dns response. The problems arise when obsolete additional records are released. The additional records list is purged but the SRV records always reference these objects, leading to an undefined behavior. Worse, this happens very quickly because additional records are never renewed. Thus, once received, an additional record will always expire. Now, the additional records are only associated to a SRV record or simply ignored. And the last version is always used. This patch helps to fix the issue #841. It must be backported to 2.2.	2020-09-08 10:44:39 +02:00
Willy Tarreau	e91bff2134	MAJOR: init: start all listeners via protocols and not via proxies anymore Ever since the protocols were added in 1.3.13, listeners used to be started twice: - once by start_proxies(), which iterates over all proxies then all listeners ; - once by protocol_bind_all() which iterates over all protocols then all listeners ; It's a real mess because error reporting is not even consistent, and more importantly now that some protocols do not appear in regular proxies (peers, logs), there is no way to retry their binding should it fail on the last step. What this patch does is to make sure that listeners are exclusively started by protocols. The failure to start a listener now causes the emission of an error indicating the proxy's name (as it used to be the case per proxy), and retryable failures are silently ignored during all but last attempts. The start_proxies() function was kept solely for setting the proxy's state to READY and emitting the "Proxy started" message and log that some have likely got used to seeking in their logs.	2020-09-02 11:11:43 +02:00
Willy Tarreau	576a633868	CLEANUP: protocol: remove all ->bind_all() and ->unbind_all() functions These ones were not used anymore since the two previous patches, let's drop them.	2020-09-02 10:40:33 +02:00
Christopher Faulet	bde2c4c621	MINOR: http-htx: Handle an optional reason when replacing the response status When calling the http_replace_res_status() function, an optional reason may now be set. It is ignored if it points to NULL and the original reason is preserved. Only the response status is replaced. Otherwise both the status and the reason are replaced. It simplifies the API and most of time, avoids an extra call to http_replace_res_reason().	2020-09-01 10:55:36 +02:00
Christopher Faulet	b8ce505c6f	MINOR: http-htx: Add an option to eval query-string when the path is replaced The http_replace_req_path() function now takes a third argument to evaluate the query-string as part of the path or to preserve it. If <with_qs> is set, the query-string is replaced with the path. Otherwise, only the path is replaced. This patch is mandatory to fix issue #829. The next commit depends on it. So be careful during backports.	2020-09-01 10:55:14 +02:00
Willy Tarreau	9dbb6c43ce	MINOR: sock: distinguish dgram from stream types when retrieving old sockets For now we still don't retrieve dgram sockets, but the code must be able to distinguish them before we switch to receivers. This adds a new flag to the xfer_sock_list indicating that a socket is of type SOCK_DGRAM. The way to set the flag for now is by looking at the dummy address family which equals AF_CUST_UDP{4,6} in this case (given that other dgram sockets are not yet supported).	2020-08-28 19:26:39 +02:00
Willy Tarreau	a2c17877b3	MINOR: sock: do not use LI_O_* in xfer_sock_list anymore We'll want to store more info there and some info that are not represented in listener options at the moment (such as dgram vs stream) so let's get rid of these and instead use a new set of options (SOCK_XFER_OPT_*).	2020-08-28 19:26:38 +02:00
Willy Tarreau	429617459d	REORG: sock: move get_old_sockets() from haproxy.c The new function was called sock_get_old_sockets() and was left as-is except a minimum amount of style lifting to make it more readable. It will never be awesome anyway since it's used very early in the boot sequence and needs to perform socket I/O without any external help.	2020-08-28 19:24:55 +02:00
Willy Tarreau	37bafdcbb1	MINOR: sock_inet: move the IPv4/v6 transparent mode code to sock_inet This code was highly redundant, existing for TCP clients, TCP servers and UDP servers. Let's move it to sock_inet where it belongs. The new functions are sock_inet4_make_foreign() and sock_inet6_make_foreign().	2020-08-28 18:51:36 +02:00
Willy Tarreau	2d34a710b1	MINOR: sock: implement sock_find_compatible_fd() This is essentially a merge from tcp_find_compatible_fd() and uxst_find_compatible_fd() that relies on a listener's address and compare function and still checks for other variations. For AF_INET6 it compares a few of the listener's bind options. A minor change for UNIX sockets is that transparent mode, interface and namespace used to be ignored when trying to pick a previous socket while now if they are changed, the socket will not be reused. This could be refined but it's still better this way as there is no more risk of using a differently bound socket by accident. Eventually we should not pass a listener there but a set of binding parameters (address, interface, namespace etc...) which ultimately will be grouped into a receiver. For now this still doesn't exist so let's stick to the listener to break dependencies in the rest of the code.	2020-08-28 18:51:36 +02:00
Willy Tarreau	a6473ede5c	MINOR: sock: add interface and namespace length to xfer_sock_list This will ease and speed up comparisons in FD lookups.	2020-08-28 18:51:36 +02:00
Willy Tarreau	063d47d136	REORG: listener: move xfer_sock_list to sock.{c,h}. This will be used for receivers as well thus it is not specific to listeners but to sockets.	2020-08-28 18:51:36 +02:00
Willy Tarreau	e5bdc51bb5	REORG: sock_inet: move default_tcp_maxseg from proto_tcp.c Let's determine it at boot time instead of doing it on first use. It also saves us from having to keep it thread local. It's been moved to the new sock_inet_prepare() function, and the variables were renamed to sock_inet_tcp_maxseg_default and sock_inet6_tcp_maxseg_default.	2020-08-28 18:51:36 +02:00
Willy Tarreau	d88e8c06ac	REORG: sock_inet: move v6only_default from proto_tcp.c to sock_inet.c The v6only_default variable is not specific to TCP but to AF_INET6, so let's move it to the right file. It's now immediately filled on startup during the PREPARE stage so that it doesn't have to be tested each time. The variable's name was changed to sock_inet6_v6only_default.	2020-08-28 18:51:36 +02:00
Willy Tarreau	25140cc573	REORG: inet: replace tcp_is_foreign() with sock_inet_is_foreign() The function now makes it clear that it's independent of the socket type and solely relies on the address family. Note that it supports both IPv4 and IPv6 as we don't seem to need it per-family.	2020-08-28 18:51:36 +02:00
Willy Tarreau	c5a94c936b	MINOR: sock_inet: implement sock_inet_get_dst() This one is common to the TCPv4 and UDPv4 code, it retrieves the destination address of a socket, taking care of the possibility that for an incoming connection the traffic was possibly redirected. The TCP and UDP definitions were updated to rely on it and remove duplicated code.	2020-08-28 18:51:36 +02:00
Willy Tarreau	f172558b27	MINOR: tcp/udp/unix: make use of proto->addrcmp() to compare addresses The new addrcmp() protocol member points to the function to be used to compare two addresses of the same family. When picking an FD from a previous process, we can now use the address specific address comparison functions instead of having to rely on a local implementation. This will help move that code to a more central place.	2020-08-28 18:51:36 +02:00
Willy Tarreau	0d06df6448	MINOR: sock: introduce sock_inet and sock_unix These files will regroup everything specific to AF_INET, AF_INET6 and AF_UNIX socket definitions and address management. Some code there might be agnostic to the socket type and could later move to af_xxxx.c but for now we only support regular sockets so no need to go too far. The files are quite poor at this step, they only contain the address comparison function for each address family.	2020-08-28 18:51:36 +02:00
Willy Tarreau	18b7df7a2b	REORG: sock: start to move some generic socket code to sock.c The new file sock.c will contain generic code for standard sockets relying on file descriptors. We currently have way too much duplication between proto_uxst, proto_tcp, proto_sockpair and proto_udp. For now only get_src, get_dst and sock_create_server_socket were moved, and are used where appropriate.	2020-08-28 18:51:36 +02:00
Willy Tarreau	478331dd93	CLEANUP: tcp: stop exporting smp_fetch_src() This is totally ugly, smp_fetch_src() is exported only so that stick_table.c can (ab)use it in the {sc,src}_* sample fetch functions. It could be argued that the sample could have been reconstructed there in place, but we don't even need to duplicate the code. We'd rather simply retrieve the "src" fetch's function from where it's used at init time and be done with it.	2020-08-28 18:51:36 +02:00
Willy Tarreau	bb1caff70f	MINOR: fd: add a new "exported" flag and use it for all regular listeners This new flag will be used to mark FDs that must be passed to any future process across the CLI's "_getsocks" command. The scheme here is quite complex and full of special cases: - FDs inherited from parent processes are not exported this way, as they are supposed to instead be passed by the master process itself across reloads. However such FDs ought never to be paused otherwise this would disrupt the socket in the parent process as well; - FDs resulting from a "bind" performed over a socket pair, which are in fact one side of a socket pair passed inside another control socket pair must not be passed either. Since all of them are used the same way, for now it's enough never to put this "exported" flag to FDs bound by the socketpair code. - FDs belonging to temporary listeners (e.g. a passive FTP data port) must not be passed either. Fortunately we don't have such FDs yet. - the rest of the listeners for now are made of TCP, UNIX stream, ABNS sockets and are exportable, so they get the flag. - UDP listeners were wrongly created as listeners and are not suitable here. Their FDs should be passed but for now they are not since the client doesn't even distinguish the SO_TYPE of the retrieved sockets. In addition, it's important to keep in mind that: - inherited FDs may never be closed in master process but may be closed in worker processes if the service is shut down (useless since still bound, but technically possible) ; - inherited FDs may not be disabled ; - exported FDs may be disabled because the caller will perform the subsequent listen() on them. However that might not work for all OSes - exported FDs may be closed, it just means the service was shut down from the worker, and will be rebound in the new process. This implies that we have to disable exported on close(). 
=> as such, contrary to an apparently obvious equivalence, the "exported" status doesn't imply anything regarding the ability to close a listener's FD or not.	2020-08-26 18:33:52 +02:00
Willy Tarreau	63d8b6009b	CLEANUP: fd: remove fd_remove() and rename fd_dodelete() to fd_delete() This essentially undoes what we did in fd.c in 1.8 to support seamless reload. Since we don't need to remove an fd anymore we can turn fd_delete() to the simple function it used to be.	2020-08-26 18:33:52 +02:00
Willy Tarreau	bf3b06b03d	MINOR: reload: determine the foreing binding status from the socket Let's not look at the listener options passed by the original process and determine from the socket itself whether it is configured for transparent mode or not. This is cleaner and safer, and doesn't rely on flag values that could possibly change between versions.	2020-08-26 10:33:02 +02:00
Shimi Gersner	5846c490ce	MEDIUM: ssl: Support certificate chaining for certificate generation haproxy supports generating SSL certificates based on SNI using a provided CA signing certificate. Because CA certificates may be signed by multiple CAs, in some scenarios, it is necessary for the server to attach the trust chain in addition to the generated certificate. The following patch adds the ability to serve the entire trust chain with the generated certificate. The chain is loaded from the provided `ca-sign-file` PEM file.	2020-08-25 16:36:06 +02:00
David Carlier	7adf8f35df	OPTIM: regex: PCRE2 use JIT match when JIT optimisation occured. When a regex has been successfully compiled by the JIT pass, it is better to use the related match function, which thankfully has the same signature, for better performance. Signed-off-by: David Carlier <devnexen@gmail.com>	2020-08-14 07:53:40 +02:00
Christopher Faulet	d25d926806	MINOR: lua: Add support for userlist as fetches and converters arguments This means the http_auth() and http_auth_group() sample fetches are now exported to Lua.	2020-08-07 14:27:54 +02:00
Christopher Faulet	e02fc4d0dd	MINOR: arg: Add an argument type to keep a reference on opaque data The ARGT_PTR argument type may now be used to keep a reference to opaque data in the argument array used by sample fetches and converters. It is a generic way to point on data. I guess it could be used for some other arguments, like proxy, server, map or stick-table.	2020-08-07 14:20:07 +02:00
Ilya Shipitsin	6b79f38a7a	CLEANUP: assorted typo fixes in the code and comments This is 12th iteration of typo fixes	2020-07-31 11:18:07 +02:00
Christopher Faulet	2747fbb7ac	MEDIUM: tcp-rules: Use a dedicated expiration date for tcp ruleset A dedicated expiration date is now used to apply the inspect-delay of the tcp-request or tcp-response rulesets. Before, the analyse expiration date was used but it may also be updated by the lua (at least). So a lua script may extend or reduce the inspect-delay as a side effect. This is not expected. If it becomes necessary, a specific function will be added to do this, because for now it is a bit confusing.	2020-07-30 09:31:09 +02:00
Christopher Faulet	810df06145	MEDIUM: htx: Add a flag on a HTX message when no more data are expected The HTX_FL_EOI flag must now be set on a HTX message when no more data are expected. Most of the time, it must be set before adding the EOM block. Thus, if there is no space for the EOM, there is still a way to know that all data were received and pushed in the HTX message. The only exception is for the HTTP replies (deny, return...). For these messages, the flag is set after all blocks are pushed in the message, including the EOM block, because, on error, we remove all inserted data.	2020-07-22 16:43:32 +02:00
Willy Tarreau	f2452b3c70	MINOR: tasks/debug: add a BUG_ON() check to detect requeued task on free __task_free() cannot be called with a task still in the queue. This test adds a check which confirms there is no concurrency issue in such a case, where a thread could requeue or wake up a task being freed.	2020-07-22 14:42:52 +02:00
Willy Tarreau	e5d79bccc0	MINOR: tasks/debug: add a few BUG_ON() to detect use of wrong timer queue This aims at catching calls to task_unlink_wq() performed by the wrong thread based on the shared status for the task, as well as calls to __task_queue() with the wrong timer queue being used based on the task's capabilities. This will at least help eliminate some hypothesis during debugging sessions when suspecting that a wrong thread has attempted to queue a task at the wrong place.	2020-07-22 14:42:52 +02:00
Willy Tarreau	2447bce554	MINOR: tasks/debug: make the thread affinity BUG_ON check a bit stricter The BUG_ON() test in task_queue() only tests for the case where we're queuing a task that doesn't run on the current thread. Let's refine it a bit further to catch all cases where the task does not run exactly on the current thread alone.	2020-07-22 14:22:38 +02:00
Emeric Brun	d3db3846c5	BUG/MEDIUM: resolve: fix init resolving for ring and peers section. Reported github issue #759 shows there is no name resolving on server lines for ring and peers sections. This patch introduces the resolving for those lines. It adds a boolean parameter to the parse_server function to specify whether the function should perform an initial name resolution using libc. This boolean is forced to true for peers or ring sections, and is kept false for classic servers (from backend/listen). This patch should be backported to branches where peers sections support 'server' lines.	2020-07-21 17:59:20 +02:00
Emeric Brun	45c457a629	MINOR: log: adds counters on received syslog messages. This patch adds a global counter of received syslog messages and this one is exported on CLI "show info" as "CumRecvLogs". This patch also updates internal conn counter and freq of the listener and the proxy for each received log message to prepare a further export on the "show stats".	2020-07-15 17:50:12 +02:00
Emeric Brun	12941c82d0	MEDIUM: log: adds log forwarding section. Log forwarding: It is possible to declare one or multiple log forwarding sections, haproxy will forward all received log messages to a log servers list. log-forward <name> Creates a new log forwarder proxy identified as <name>. bind <addr> [param*] Used to configure a log udp listener to receive messages to forward. Only udp listeners are allowed, address must be prefixed using 'udp@', 'udp4@' or 'udp6@'. This supports all "bind" parameters found in 5.1 paragraph but most of them are irrelevant for udp/syslog case. log global log <address> [len <length>] [format <format>] [sample <ranges>:<smp_size>] <facility> [<level> [<minlevel>]] Used to configure target log servers. See more details on proxies documentation. If no format is specified, haproxy tries to keep the incoming log format. Configured facility is ignored, except if incoming message does not present a facility but one is mandatory on the outgoing format. If there is no timestamp available in the input format, but the field exists in output format, haproxy will use the local date. Example: global log stderr format iso local7 ring myring description "My local buffer" format rfc5424 maxlen 1200 size 32764 timeout connect 5s timeout server 10s # syslog tcp server server mysyslogsrv 127.0.0.1:514 log-proto octet-count log-forward sylog-loadb bind udp4@127.0.0.1:1514 # all messages on stderr log global # all messages on local tcp syslog server log ring@myring local0 # load balance messages on 4 udp syslog servers log 127.0.0.1:10001 sample 1:4 local0 log 127.0.0.1:10002 sample 2:4 local0 log 127.0.0.1:10003 sample 3:4 local0 log 127.0.0.1:10004 sample 4:4 local0	2020-07-15 17:50:12 +02:00
Emeric Brun	54932b4408	MINOR: log: adds syslog udp message handler and parsing. This patch introduces a new fd handler used to parse syslog messages on udp. The parsing function returns level, facility and metadata that can be immediately reused to forward the message to a log server. This handler is enabled on udp listeners if the proxy is internally set to mode PR_MODE_SYSLOG	2020-07-15 17:50:12 +02:00
Emeric Brun	546488559a	MEDIUM: log/sink: re-work and merge of build message API. This patch merges build message code between sink and log and introduces a new API based on struct ist array to prepare message header with zero copy, targeting the log forwarding feature. Log formats 'iso' and 'timed' are now available on log lines. A new log format 'priority' is also added.	2020-07-15 17:50:12 +02:00
Emeric Brun	3835c0dcb5	MEDIUM: udp: adds minimal proto udp support for message listeners. This patch introduces proto_udp.c targeting a further support of log forwarding feature. This code was originally produced by Frederic Lecaille working on QUIC support and only minimal requirements for syslog support have been merged.	2020-07-15 17:50:12 +02:00
Christopher Faulet	aaa70852d9	MINOR: raw_sock: Report the number of bytes emitted using the splicing In the continuity of the commit `7cf0e4517` ("MINOR: raw_sock: report global traffic statistics"), we are now able to report the global number of bytes emitted using the splicing. It can be retrieved in "show info" output on the CLI. Note this counter is always declared, regardless of the splicing support. This eases the integration with monitoring tools plugged on the CLI.	2020-07-15 14:08:14 +02:00
Christopher Faulet	0f9ff14b17	CLEANUP: connection: remove unused field idle_time from the connection struct Thanks to previous changes, this field is now unused.	2020-07-15 14:08:14 +02:00
Christopher Faulet	c6e7563b1a	MINOR: server: Factorize code to deal with connections removed from an idle list The srv_del_conn_from_list() function is now responsible for updating the server counters and the connection flags when a connection is removed from an idle list (safe, idle or available). It is called when a connection is released or when a connection is set as private. This function also removes the connection from the idle list if necessary.	2020-07-15 14:08:14 +02:00
Christopher Faulet	3d52f0f1f8	MINOR: server: Factorize code to deal with reuse of server idle connections The srv_use_idle_conn() function is now responsible for updating the server counters and the connection flags when an idle connection is reused. The same function is called when a new connection is created. This simplifies a bit the connect_server() function.	2020-07-15 14:08:14 +02:00
Christopher Faulet	15979619c4	MINOR: session: Take care to decrement idle_conns counter in session_unown_conn So conn_free() only calls session_unown_conn() if necessary. The details are now fully handled by session_unown_conn().	2020-07-15 14:08:14 +02:00
Christopher Faulet	236c93b108	MINOR: connection: Set the conncetion target during its initialisation When a new connection is created, its target is always set just after. So the connection target may be set when it is created instead, during its initialisation to be precise. That is the purpose of this patch. Now, the conn_new() function is called with the connection target as parameter. The target is then passed to conn_init(). It means the target must be passed when cs_new() is called. In this case, the target is only used when the conn-stream is created with no connection. This only happens for tcpchecks for now.	2020-07-15 14:08:14 +02:00
Christopher Faulet	fcc3d8a1c0	MINOR: connection: Use a dedicated function to look for a session's connection The session_get_conn() must now be used to look for an available connection matching a specific target for a given session. This simplifies a bit the connect_server() function.	2020-07-15 14:08:14 +02:00
Christopher Faulet	08016ab82d	MEDIUM: connection: Add private connections synchronously in session server list When a connection is marked as private, it is now added in the session server list. We don't wait for a stream to be detached from the mux to do so. When the connection is created, this happens after the mux creation. Otherwise, it is performed when the connection is marked as private. To allow that, when a connection is created, the session is systematically set as the connection owner. Thus, a backend connection always has an owner during its creation, and a private connection always has an owner until its death. Note that outside the detach() callback, if the call to session_add_conn() failed, the error is ignored. In this situation, we retry to add the connection into the session server list in the detach() callback. If this fails at this step, the multiplexer is destroyed and the connection is closed.	2020-07-15 14:08:14 +02:00
Christopher Faulet	21ddc74e8a	MINOR: connection: Add a wrapper to mark a connection as private To set a connection as private, the conn_set_private() function must now be called. It sets the CO_FL_PRIVATE flag, but it also removes the connection from the available connection list, if necessary. For now, it never happens because only HTTP/1 connections may be set as private after their creation. And these connections are never inserted in the available connection list.	2020-07-15 14:08:14 +02:00
Willy Tarreau	a9d7b76f6a	MINOR: connection: use MT_LIST_ADDQ() to add connections to idle lists When a connection is added to an idle list, it's already detached and cannot be seen by two threads at once, so there's no point using TRY_ADDQ, there will never be any conflict. Let's just use the cheaper ADDQ.	2020-07-10 08:52:13 +02:00
Willy Tarreau	8689127816	MINOR: buffer: use MT_LIST_ADDQ() for buffer_wait lists additions The TRY_ADDQ there was not needed since the wait list is exclusively owned by the caller. There's a preliminary test on MT_LIST_ADDED() that might have been eliminated by keeping MT_LIST_TRY_ADDQ() but it would have required two more expensive writes before testing so better keep the test the way it is.	2020-07-10 08:52:13 +02:00
Willy Tarreau	de4db17dee	MINOR: lists: rename some MT_LIST operations to clarify them Initially when mt_lists were added, their purpose was to be used with the scheduler, where anyone may concurrently add the same tasklet, so it sounded natural to implement a check in MT_LIST_ADD{,Q}. Later their usage was extended and MT_LIST_ADD{,Q} started to be used on situations where the element to be added was exclusively owned by the one performing the operation so a conflict was impossible. This became more obvious with the idle connections and the new macro was called MT_LIST_ADDQ_NOCHECK. But this remains confusing and at many places it's not expected that an MT_LIST_ADD could possibly fail, and worse, at some places we start by initializing it before adding (and the test is superfluous) so let's rename them to something more conventional to denote the presence of the check or not: MT_LIST_ADD{,Q} : unconditional operation, the caller owns the element, and doesn't care about the element's current state (exactly like LIST_ADD) MT_LIST_TRY_ADD{,Q}: only perform the operation if the element is not already added or in the process of being added. This means that the previously "safe" MT_LIST_ADD{,Q} are not "safe" anymore. This also means that in case of backport mistakes in the future causing this to be overlooked, the slower and safer functions will still be used by default. Note that the missing unchecked MT_LIST_ADD macro was added. The rest of the code will have to be reviewed so that a number of callers of MT_LIST_TRY_ADDQ are changed to MT_LIST_ADDQ to remove the unneeded test.	2020-07-10 08:50:41 +02:00
MIZUTA Takeshi	b24bc0dfb6	MINOR: tcp: Support TCP keepalive parameters customization It is now possible to customize TCP keepalive parameters. These correspond to the socket options TCP_KEEPCNT, TCP_KEEPIDLE, TCP_KEEPINTVL and are valid for the defaults, listen, frontend and backend sections. This patch fixes GitHub issue #670.	2020-07-09 05:22:16 +02:00
Willy Tarreau	3b8f9b7b88	BUG/MEDIUM: lists: add missing store barrier in MT_LIST_ADD/MT_LIST_ADDQ The torture test run for previous commit `787dc20` ("BUG/MEDIUM: lists: add missing store barrier on MT_LIST_BEHEAD()") finally broke again after 34M connections. It appeared that MT_LIST_ADD and MT_LIST_ADDQ were suffering from the same missing barrier when restoring the original pointers before giving up, when checking if the element was already added. This is indeed something which seldom happens with the shared scheduler, in case two threads simultaneously try to wake up the same tasklet. With a store barrier there after reverting the pointers, the torture test survived 750M connections on the NanoPI-Fire3, so it looks good this time. Probably that MT_LIST_BEHEAD should be added to test-list.c since it seems to be more sensitive to concurrent accesses with MT_LIST_ADDQ. It's worth noting that there is no barrier between the last two pointers update, while there is one in MT_LIST_POP and MT_LIST_BEHEAD, the latter having shown to be needed, but I cannot demonstrate why we would need one here. Given that the code seems solid here, let's stick to what is shown to work. This fix should be backported to 2.1, just for the sake of safety since the issue couldn't be triggered there, but it could change with the compiler or when backporting a fix for example.	2020-07-09 05:01:27 +02:00
Willy Tarreau	787dc20952	BUG/MEDIUM: lists: add missing store barrier on MT_LIST_BEHEAD() When running multi-threaded tests on my NanoPI-Fire3 (8 A53 cores), I managed to occasionally get either a bus error or a segfault in the scheduler, but only when running at a high connection rate (injecting on a tcp-request connection reject rule). The bug is rare and happens around once per million connections. I could never reproduce it with less than 4 threads nor on A72 cores. Haproxy 2.1.0 would also fail there but not 2.1.7. Every time the crash happened with the TL_URGENT task list corrupted, though it was not immediately after the LIST_SPLICE() call, indicating background activity survived the MT_LIST_BEHEAD() operation. This queue is where the shared runqueue is transferred, and the shared runqueue gets fast inter-thread tasklet wakeups from idle conn takeover and new connections. Comparing the MT_LIST_BEHEAD() and MT_LIST_DEL() implementations, it's quite obvious that a few barriers are missing from the former, and these will simply fail on weakly ordered caches. Two store barriers were added before the break() on failure, to match what is done on the normal path. Missing them almost always results in a segfault which is quite rare but consistent (after ~3M connections). The 3rd one before updating n->prev seems intuitively needed though I couldn't make the code fail without it. It's present in MT_LIST_DEL so better not be needlessly creative. The last one is the most important one, and seems to be the one that helps a concurrent MT_LIST_ADDQ() detect a late failure and try again. With this, the code survives at least 30M connections. Interestingly the exact same issue was addressed in 2.0-dev2 for MT_LIST_DEL with commit `690d2ad4d` ("BUG/MEDIUM: list: add missing store barriers when updating elements and head"). This fix must be backported to 2.1 as MT_LIST_BEHEAD() is also used there. 
It's only tagged as medium because it will only affect entry-level CPUs like Cortex A53 (x86 are not affected), and requires load levels that are very hard to achieve on such machines to trigger it. In practice it's unlikely anyone will ever hit it.	2020-07-08 19:45:50 +02:00
Willy Tarreau	e3cb9978c2	MINOR: version: back to development, update status message Update the status message and update INSTALL again.	2020-07-07 16:38:51 +02:00
Willy Tarreau	33205c23a7	[RELEASE] Released version 2.3-dev0 Released version 2.3-dev0 with the following main changes : - exact copy of 2.2.0	2020-07-07 16:35:28 +02:00
Willy Tarreau	44c47de81a	MINOR: version: mention that it's an LTS release now The new version is going to be LTS up to around Q2 2025.	2020-07-07 16:31:52 +02:00
William Lallemand	7d42ef5b22	WIP/MINOR: ssl: add sample fetches for keylog in frontend OpenSSL 1.1.1 provides a callback registering function SSL_CTX_set_keylog_callback, which allows one to receive a string containing the keys to decipher TLSv1.3. Unfortunately it is not possible to store this data in binary form and we can only get this information using the callback, which means that we need to store it until the connection is closed. This patch adds 2 pools. The first one, pool_head_ssl_keylog, is used to store a struct ssl_keylog which will be inserted as an ex_data in an SSL *. The second one is pool_head_ssl_keylog_str which will be used to store the hexadecimal strings. To enable the capture of the keys, you need to set "tune.ssl.keylog on" in your configuration. The following fetches were implemented: ssl_fc_client_early_traffic_secret, ssl_fc_client_handshake_traffic_secret, ssl_fc_server_handshake_traffic_secret, ssl_fc_client_traffic_secret_0, ssl_fc_server_traffic_secret_0, ssl_fc_exporter_secret, ssl_fc_early_exporter_secret	2020-07-06 19:08:03 +02:00
Ilya Shipitsin	46a030cdda	CLEANUP: assorted typo fixes in the code and comments This is 11th iteration of typo fixes	2020-07-06 14:34:32 +02:00
Willy Tarreau	b0be8ae2a8	CLEANUP: auth: fix useless self-include of auth-t.h Since recent include cleanups auth-t.h ended up including itself.	2020-07-05 21:32:47 +02:00
Willy Tarreau	0c439d8956	BUILD: tools: make resolve_sym_name() return a const Originally it was made to return a void* because some comparisons in the code where it was used required a lot of casts. But now we don't need that anymore. And having it non-const breaks the build on NetBSD 9 as reported in issue #728. So let's switch to const and adjust debug.c to accommodate this.	2020-07-05 20:26:04 +02:00
Olivier Houchard	a74bb7e26e	BUG/MEDIUM: connections: Let the xprt layer know a takeover happened. When we takeover a connection, let the xprt layer know. If it has its own tasklet, and it is already scheduled, then it has to be destroyed, otherwise it may run the new mux tasklet on the old thread. Note that we only do this for the ssl xprt for now, because the only other one that might wake the mux up is the handshake one, which is supposed to disappear before idle connections exist. No backport is needed, this is for 2.2.	2020-07-03 17:49:33 +02:00
Olivier Houchard	1662cdb0c6	BUG/MEDIUM: connections: Set the tid for the old tasklet on takeover. In the various takeover() methods, make sure we schedule the old tasklet on the old thread, as we don't want it to run on our own thread! This was causing a very rare crash when building with DEBUG_STRICT, seeing that either an FD's thread mask didn't match the thread ID in h1_io_cb(), or that stream_int_notify() would try to queue a task with the wrong tid_bit. In order to reproduce this, it is necessary to maintain many connections (typically 30k) at a high request rate flowing over H1+SSL between two proxies, the second of which would randomly reject ~1% of the incoming connections and randomly kill some idle ones using a very short client timeout. The request rate must be adjusted so that the CPUs are nearly saturated, but never reach 100%. It's easier to reproduce this by skipping local connections and always picking from other threads. The issue should happen in less than 20s otherwise it's necessary to restart to reset the idle connections lists. No backport is needed, takeover() is 2.2 only.	2020-07-03 17:49:23 +02:00
Willy Tarreau	43079e0731	MINOR: sched: split tasklet_wakeup() into tasklet_wakeup_on() tasklet_wakeup() only checks tl->tid to know whether the task is programmed to run on the current thread or on a specific thread. We'll have to ease this selection in a subsequent patch, preferably without modifying tl->tid, so let's have a new tasklet_wakeup_on() function to specify the thread number to run on. Note that the logic has not changed at all.	2020-07-03 17:19:47 +02:00
Emeric Brun	9f9b22c4f1	MINOR: log: add time second fraction field to rfc5424 log timestamp. This patch adds the time second fraction in microseconds as supported by the rfc.	2020-07-02 17:56:06 +02:00
Willy Tarreau	dab586c3a8	BUILD: debug: avoid build warnings with DEBUG_MEM_STATS Some libcs define strdup() as a macro and cause redefine warnings to be emitted, so let's first undefine all functions we redefine.	2020-07-02 10:25:01 +02:00
Dragan Dosen	1e3b16f74f	MINOR: log-format: allow to preserve spacing in log format strings Now it's possible to preserve spacing everywhere except in "log-format", "log-format-sd" and "unique-id-format" directives, where spaces are delimiters and are merged. That may be useful when the response payload is specified as a log format string by "lf-file" or "lf-string", or even for headers or anything else. In order to merge spaces, a new option LOG_OPT_MERGE_SPACES is applied exclusively on options passed to function parse_logformat_string(). This patch fixes an issue #701 ("http-request return log-format file evaluation altering spacing of ASCII output/art").	2020-07-02 10:11:44 +02:00
Willy Tarreau	a6026a0c92	MINOR: debug: add a new "debug dev memstats" command Now when building with -DDEBUG_MEM_STATS, some malloc/calloc/strdup/realloc stats are kept per file+line number and may be displayed and even reset on the CLI using "debug dev memstats". This allows to easily track potential leakers or abnormal usages.	2020-07-02 09:14:48 +02:00
Willy Tarreau	76cc699017	MINOR: config: add a new tune.idle-pool.shared global setting. Enables ('on') or disables ('off') sharing of idle connection pools between threads for a same server. The default is to share them between threads in order to minimize the number of persistent connections to a server, and to optimize the connection reuse rate. But to help with debugging or when suspecting a bug in HAProxy around connection reuse, it can be convenient to forcefully disable this idle pool sharing between multiple threads, and force this option to "off". The default is on. This could have been nice to have during the idle connections debugging, but it's not too late to add it!	2020-07-01 19:07:37 +02:00
Olivier Houchard	ff1d0929b8	MEDIUM: connections: Don't use a lock when moving connections to remove. Make it so we don't have to take a lock while moving a connection from the idle list to the toremove_list by taking advantage of the MT_LIST.	2020-07-01 17:09:19 +02:00
Olivier Houchard	f8f4c2ef60	CLEANUP: connections: rename the toremove_lock to takeover_lock This lock was misnamed and a bit confusing. It's only used for takeover so let's call it takeover_lock.	2020-07-01 17:09:10 +02:00
Olivier Houchard	bbee1f7e78	MINOR: list: Add MT_LIST_DEL_SAFE_NOINIT() and MT_LIST_ADDQ_NOCHECK() Add two new macros, MT_LIST_DEL_SAFE_NOINIT makes sure we remove the element from the list, without reinitializing its next and prev, and MT_LIST_ADDQ_NOCHECK is similar to MT_LIST_ADDQ(), except it doesn't check if the element is already in a list. The goal is to be able to move an element from a list we're currently parsing to another, keeping it locked in the meanwhile.	2020-07-01 17:04:00 +02:00
Willy Tarreau	eb8c2c69fa	MEDIUM: sched: implement task_kill() to kill a task task_kill() may be used by any thread to kill any task with less overhead than a regular wakeup. In order to achieve this, it bypasses the priority tree and inserts the task directly into the shared tasklets list, cast as a tasklet. The task_list_size is updated to make sure it is properly decremented after execution of this task. The task will thus be picked by process_runnable_tasks() after checking the tree and sent to the TL_URGENT list, where it will be processed and killed. If the task is bound to more than one thread, its first thread will be the one notified. If the task was already queued or running, nothing is done, only the flag is added so that it gets killed before or after execution. Of course it's the caller's responsibility to make sure any resources allocated by this task were already cleaned up or taken over.	2020-07-01 16:35:53 +02:00
Willy Tarreau	8a6049c268	MEDIUM: sched: create a new TASK_KILLED task flag This flag, when set, will be used to indicate that the task must die. At the moment this may only be placed by the task itself or by the scheduler when placing it into the TL_NORMAL queue.	2020-07-01 16:35:49 +02:00
Willy Tarreau	364f25a688	MINOR: backend: don't always takeover from the same threads The next thread walking algorithm in commit `566df309c` ("MEDIUM: connections: Attempt to get idle connections from other threads.") proved to be sufficient for most cases, but it still has some rough edges when threads are unevenly loaded. If one thread wakes up with 10 streams to process in a burst, it will mainly take over connections from the next one until it doesn't have anymore. This patch implements a rotating index that is stored into the server list and that any thread taking over a connection is responsible for updating. This way it starts mostly random and avoids always picking from the same place. This results in a smoother distribution overall and a slightly lower takeover rate.	2020-07-01 16:07:43 +02:00
Willy Tarreau	2f3f4d3441	MEDIUM: server: add a new pool-low-conn server setting The problem with the way idle connections currently work is that it's easy for a thread to steal all of its siblings' connections, then release them, then it's done by another one, etc. This happens even more easily due to scheduling latencies, or merged events inside the same pool loop, which, when dealing with a fast server responding in sub-millisecond delays, can really result in one thread being fully at work at a time. In such a case, we perform a huge amount of takeover() which consumes CPU and requires quite some locking, sometimes resulting in lower performance than expected. In order to fight against this problem, this patch introduces a new server setting "pool-low-conn", whose purpose is to dictate when it is allowed to steal connections from a sibling. As long as the number of idle connections remains at least as high as this value, it is permitted to take over another connection. When the idle connection count becomes lower, a thread may only use its own connections or create a new one. By proceeding like this even with a low number (typically 2*nbthreads), we quickly end up in a situation where all active threads have a few connections. It then becomes possible to connect to a server without bothering other threads the vast majority of the time, while still being able to use these connections when the number of available FDs becomes low. We also use this threshold instead of global.nbthread in the connection release logic, allowing to keep more extra connections if needed. A test performed with 10000 concurrent HTTP/1 connections, 16 threads and 210 servers with 1 millisecond of server response time showed the following numbers: haproxy 2.1.7: 185000 requests per second haproxy 2.2: 314000 requests per second haproxy 2.2 lowconn 32: 352000 requests per second The takeover rate goes down from 300k/s to 13k/s. 
The difference is further amplified as the response time shrinks.	2020-07-01 15:23:15 +02:00
Willy Tarreau	35e30c9670	BUG/MINOR: server: fix the connection release logic regarding nearly full conditions There was a logic bug in commit `ddfe0743d` ("MEDIUM: server: use the two thresholds for the connection release algorithm"): instead of keeping only our first idle connection when FDs become scarce, the condition was inverted resulting in enforcing this constraint unless FDs are scarce. This results in less idle connections than permitted to be kept under normal condition. No backport needed.	2020-07-01 14:14:29 +02:00
Willy Tarreau	daf8aa62a8	MINOR: pools: increase MAX_BASE_POOLS to 64 When not sharing pools (i.e. when building with -DDEBUG_DONT_SHARE_POOLS) we have about 47 pools right now, while MAX_BASE_POOLS is only 32, meaning that only the first 32 ones will benefit from a per-thread cache entry. This totally kills performance when pools are not shared (roughly -20%). Let's double the limit to gain some margin, and make it possible to set it as a build option. It might be useful to backport this to stable versions as they're likely to be affected as well.	2020-06-30 14:29:02 +02:00
Willy Tarreau	ddfe0743d8	MEDIUM: server: use the two thresholds for the connection release algorithm The algorithm improvement in `bdb86bd` ("MEDIUM: server: improve estimate of the need for idle connections") is still not enough because there's a hard limit between below and above the FD count, so it continues to end up with many killed connections. Here we're proceeding differently. Given that there are two configured limits, a low and a high one, we drop connections when the high limit is reached (which is already done by the killing task anyway); between the low and the high threshold, we only keep the connection if our idle entries are empty (with a preference for safe ones); and below the low threshold, we keep any connection so as to give it a chance of being reused or taken over by another thread. Proceeding like this results in far fewer dropped connections: we typically see a 99.3% reuse rate (76k conns for 10M requests over 200 servers and 4 threads, with 335k takeovers or 3%), and much smaller CPU usage variations because there are no more bursts to try to kill extra connections. It should be possible to further improve this by counting the number of threads exploiting a server and trying to optimize the amount of per-thread idle connections so that it is approximately balanced among the threads.	2020-06-29 21:54:38 +02:00
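The three-zone release policy described in ddfe0743d8 can be sketched roughly as below. This is only an illustration under assumptions: the enum, the function name `release_decision()` and its parameters are hypothetical and do not mirror the actual haproxy code, the compared quantity is assumed to be a count of used FDs, and the preference for safe connections is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of releasing a connection; not haproxy's real types. */
enum release_action { REL_KEEP, REL_DROP };

/* Sketch of the two-threshold policy:
 *  - at or above the high threshold: drop the connection;
 *  - between low and high: keep it only if this thread has no idle conn left;
 *  - below the low threshold: always keep it.
 */
static enum release_action release_decision(unsigned int fd_used,
                                            unsigned int low_thr,
                                            unsigned int high_thr,
                                            bool thread_idle_empty)
{
    assert(low_thr <= high_thr);

    if (fd_used >= high_thr)
        return REL_DROP;
    if (fd_used >= low_thr)
        return thread_idle_empty ? REL_KEEP : REL_DROP;
    return REL_KEEP;
}
```

The point of the middle zone is that a thread keeps at most one connection for itself there, instead of flipping abruptly between "keep everything" and "kill everything" around a single limit.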
Willy Tarreau	e69282a03f	BUG/MINOR: server: always count one idle slot for current thread The idle server connection estimates brought in commit `bdb86bd` ("MEDIUM: server: improve estimate of the need for idle connections") were committed without the minimum of 1 idle conn needed for the current thread. The net effect is that there are bursts of dropped connections when the load varies because there's no provision for the last connection. No backport needed, this is 2.2-dev.	2020-06-29 21:54:38 +02:00
Willy Tarreau	d59946e673	Revert "BUG/MEDIUM: lists: Lock the element while we check if it is in a list." This reverts previous commit 347bbf79d20e1cff57075a8a378355dfac2475e2. The original code was correct. This patch resulted from a mistaken analysis and breaks the scheduler: ########################## Starting vtest ########################## Testing with haproxy version: 2.2-dev11-90b7d9-23 # top TEST reg-tests/lua/close_wait_lf.vtc TIMED OUT (kill -9) # top TEST reg-tests/lua/close_wait_lf.vtc FAILED (10.008) signal=9 1 tests failed, 0 tests skipped, 88 tests passed Program terminated with signal SIGABRT, Aborted. [Current thread is 1 (Thread 0x7fb0dac2c700 (LWP 11292))] (gdb) bt #0 0x00007fb0e7c143f8 in raise () from /lib64/libc.so.6 #1 0x00007fb0e7c15ffa in abort () from /lib64/libc.so.6 #2 0x000000000053f5d6 in ha_panic () at src/debug.c:269 #3 0x00000000005a6248 in wdt_handler (sig=14, si=<optimized out>, arg=<optimized out>) at src/wdt.c:119 #4 <signal handler called> #5 0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351 #6 listener_accept (fd=<optimized out>) at src/listener.c:999 #7 0x00000000004262df in fd_update_events (evts=<optimized out>, fd=6) at include/haproxy/fd.h:418 #8 _do_poll (p=<optimized out>, exp=<optimized out>, wake=<optimized out>) at src/ev_epoll.c:251 #9 0x0000000000548d0f in run_poll_loop () at src/haproxy.c:2949 #10 0x000000000054908b in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3067 #11 0x00007fb0e902b684 in start_thread () from /lib64/libpthread.so.0 #12 0x00007fb0e7ce5eed in clone () from /lib64/libc.so.6 (gdb) up #5 0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351 351 if (MT_LIST_ADDQ(&task_per_thread[tl->tid].shared_tasklet_list, (struct mt_list *)&tl->list) == 1) { If the commit above is ever backported, this one must be as well!	2020-06-29 21:54:37 +02:00
Olivier Houchard	347bbf79d2	BUG/MEDIUM: lists: Lock the element while we check if it is in a list. In MT_LIST_ADDQ() and MT_LIST_ADD() we can't just check if the element is already in a list, because there's a small race condition, it could be added between the time we checked, and the time we actually set its next and prev. So we have to lock it first. This should be backported to 2.1.	2020-06-29 19:59:06 +02:00
Willy Tarreau	a9fcecbdf3	MINOR: stats: add the estimated need of concurrent connections per server The max_used_conns value is used as an estimate of the needed number of connections on a server to know how many to keep open. But this one is not reported, making it hard to troubleshoot reuse issues. Let's export it in the sessions/current column.	2020-06-29 16:29:11 +02:00
Willy Tarreau	bdb86bdaab	MEDIUM: server: improve estimate of the need for idle connections Starting with commit `079cb9a` ("MEDIUM: connections: Revamp the way idle connections are killed") we started to improve the way to compute the need for idle connections. But the condition to keep a connection idle or drop it when releasing it was not updated. This often results in storms of close when certain thresholds are met, and long series of takeover() when there aren't enough connections left for a thread on a server. This patch tries to improve the situation this way: - it keeps an estimate of the number of connections needed for a server. This estimate is a copy of the max over previous purge period, or is a max of what is seen over current period; it differs from max_used_conns in that this one is a counter that's reset on each purge period ; - when releasing, if the number of current idle+used connections is lower than this last estimate, then we'll keep the connection; - when releasing, if the current thread's idle conns head is empty, and we don't exceed the estimate by the number of threads, then we'll keep the connection. - when cleaning up connections, we consider the max of the last two periods to avoid killing too many idle conns when facing bursty traffic. Thanks to this we can better converge towards a situation where, provided there are enough FDs, each active server keeps at least one idle connection per thread all the time, with a total number close to what was needed over the previous measurement period (as defined by pool-purge-delay). On tests with large numbers of concurrent connections (30k) and many servers (200), this has quite smoothed the CPU usage pattern, increased the reuse rate and roughly halved the takeover rate.	2020-06-29 16:29:10 +02:00
Willy Tarreau	b159132ea3	MINOR: activity: add per-thread statistics on FD takeover The FD takeover operation might have certain impacts explaining unexpected activities, so it's important to report such a counter there. We thus count the number of times a thread has stolen an FD from another thread.	2020-06-29 14:26:05 +02:00
Willy Tarreau	3bb617cfe0	MINOR: stats: add 3 new output values for the per-server idle conn state The servers have internal states describing the status of idle connections, unfortunately these were not exported in the stats. This patch adds the 3 following gauges: - idle_conn_cur : Current number of unsafe idle connections - safe_conn_cur : Current number of safe idle connections - used_conn_cur : Current number of connections in use	2020-06-29 14:26:05 +02:00
Willy Tarreau	20dc3cd4a6	MINOR: pools: move the LRU cache heads to thread_info The LRU cache head was an array of list, which causes false sharing between 4 to 8 threads in the same cache line. Let's move it to the thread_info structure instead. There's no need to do the same for the pool_cache[] array since it's already quite large (32 pointers each). By doing this the request rate increased by 1% on a 16-thread machine.	2020-06-29 10:36:37 +02:00
Willy Tarreau	c03d7632a5	CLEANUP: pool: only include the type files from types pool-t.h was mistakenly including the full-blown includes for threads, lists and api instead of the types, and as such, CONFIG_HAP_LOCAL_POOLS and CONFIG_HAP_LOCKLESS_POOLS were not visible everywhere.	2020-06-29 10:11:24 +02:00
Willy Tarreau	e4d1505c83	REORG: includes: create tinfo.h for the thread_info struct The thread_info struct is convenient to store various per-thread info without having to resort to a painful thread_local storage which is slow and painful to initialize. The problem is, by having this one in thread.h it's very difficult to add more entries there because everyone already includes thread.h so conversely thread.h cannot reference certain types. There's no point in having this there, instead let's create a new pair of files, tinfo{,-t}.h, which declare the structure. This way it will become possible to extend them with other includes and have certain files store their own types there.	2020-06-29 09:57:23 +02:00
Willy Tarreau	4d82bf5c2e	MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns We used to have 3 thread-based arrays for toremove_lock, idle_cleanup, and toremove_connections. The problem is that these items are small, and that this creates false sharing between threads since it's possible to pack up to 8-16 of these values into a single cache line. This can cause real damage where there is contention on the lock. This patch creates a new array of struct "idle_conns" that is aligned on a cache line and which contains all three members above. This way each thread has access to its variables without hindering the other ones. Just doing this increased the HTTP/1 request rate by 5% on a 16-thread machine. The definition was moved to connection.{c,h} since it appeared a more natural evolution of the ongoing changes given that there was already one of them declared in connection.h previously.	2020-06-28 10:52:36 +02:00
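The false-sharing fix described in 4d82bf5c2e amounts to grouping the per-thread members in one struct padded to a cache line. A minimal sketch follows, assuming a 64-byte cache line and GCC/clang attributes; the struct name, member types and the helper `idle_conns_of()` are hypothetical stand-ins, not the real haproxy definitions.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_SKETCH 64  /* assumed cache line size, in bytes */

/* One entry per thread; the GCC/clang "aligned" attribute pads and aligns
 * the struct so that two threads never share a cache line.
 */
struct idle_conns_sketch {
    int   toremove_lock;        /* stand-in for the real lock type */
    void *cleanup_task;         /* stand-in for idle_cleanup */
    void *toremove_connections; /* stand-in for the real list head */
} __attribute__((aligned(CACHELINE_SKETCH)));

/* Returns the entry owned by thread <tid> among <nbthread> entries. */
static struct idle_conns_sketch *
idle_conns_of(struct idle_conns_sketch *array, size_t nbthread, size_t tid)
{
    assert(tid < nbthread);
    return &array[tid];
}
```

With three separate small arrays, neighbouring threads' locks land in the same cache line; with one aligned struct per thread, each thread's lock and lists sit in a line of their own.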
Willy Tarreau	d79422a0ff	BUG/MEDIUM: buffers: always allocate from the local cache first It looked strange to see pool_evict_from_cache() always very present on "perf top", but there was actually a reason for this: while b_free() uses pool_free() which properly disposes the buffer into the local cache and b_alloc_fast() allocates using pool_get_first() which considers the local cache, b_alloc_margin() does not consider the local cache as it only uses __pool_get_first() which only allocates from the shared pools. The impact is that basically everywhere a buffer is allocated (muxes, streams, applets), it's always picked from the shared pool (hence involves locking) and is released to the local one and makes it grow until it's required to trigger a flush using pool_evict_from_cache(). Buffer usage is thus not thread-local at all, and causes eviction of a lot of possibly useful objects from the local caches. Just fixing this results in a 10% request rate increase in an HTTP/1 test on a 16-thread machine. This bug was caused by recent commit `ed891fd` ("MEDIUM: memory: make local pools independent on lockless pools") merged into 2.2-dev9, so no backport is needed.	2020-06-28 10:45:35 +02:00
Willy Tarreau	4dc6c860b4	CLEANUP: buffers: remove unused buffer_wq_lock lock Commit `2104659` ("MEDIUM: buffer: remove the buffer_wq lock") removed usage of the lock but not the lock itself. It's totally unused, let's remove it.	2020-06-28 10:45:34 +02:00
Anthonin Bonnefoy	85048f80c9	MINOR: http: Add support for http 413 status Add 413 http "payload too large" status code. This will allow 413 to be used in deny_status and errorfile.	2020-06-26 11:30:02 +02:00
Ilya Shipitsin	47d17182f4	CLEANUP: assorted typo fixes in the code and comments This is the 10th iteration of typo fixes	2020-06-26 11:27:28 +02:00
Ilya Shipitsin	f44d155515	BUILD: fix ssl_sample.c when building against BoringSSL BoringSSL does not have X509_get_X509_PUBKEY; let our emulation level define that for BoringSSL as well. Build log: src/ssl_sample.o: In function `smp_fetch_ssl_x_key_alg': /home/travis/build/haproxy/haproxy/src/ssl_sample.c:592: undefined reference to `X509_get_X509_PUBKEY' clang-7: error: linker command failed with exit code 1 (use -v to see invocation) Makefile:860: recipe for target 'haproxy' failed make: *** [haproxy] Error 1 travis-ci: https://travis-ci.com/github/haproxy/haproxy/jobs/351670996	2020-06-26 10:33:38 +02:00
Willy Tarreau	c54e5ad9cc	MINOR: cfgparse: sanitize the output a little bit With the rework of the config line parser, we've started to emit a dump of the initial line underlined by a caret character indicating the error location. But with extremely large lines it starts to take time and can even cause trouble to slow terminals (e.g. over ssh), and this becomes useless. In addition, control characters could be dumped as-is which is bad, especially when the input file is accidentally wrong (an executable). This patch adds a string sanitization function which isolates an area around the error position in order to report only that area if the string is too large. The limit was set to 80 characters, which will result in roughly 40 chars around the error being reported only, prefixed and suffixed with "..." as needed. In addition, non-printable characters in the line are now replaced with '?' so as not to corrupt the terminal. This way invalid variable names, unmatched quotes etc will be easier to spot. A typical output is now: [ALERT] 176/092336 (23852) : parsing [bad.cfg:8]: forbidden first char in environment variable name at position 811957: ...c$PATH$PATH$d(xlc`%?$PATH$PATH$dgc?T$%$P?AH?$PATH$PATH$d(?$PATH$PATH$dgc?%... ^	2020-06-25 09:43:27 +02:00
Willy Tarreau	e7723bddd7	MEDIUM: tasks: add a tune.sched.low-latency option Now that all tasklet queues are scanned at once by run_tasks_from_lists(), it becomes possible to always check for lower priority classes and jump back to them when they exist. This patch adds tune.sched.low-latency global setting to enable this behavior. What it does is stick to the lowest ranked priority list in which tasks are still present with an available budget, and leave the loop to refill the tasklet lists if the trees got new tasks or if new work arrived into the shared urgent queue. Doing so allows to cut the latency in half when running with extremely deep run queues (10k-100k), thus allowing forwarding of small and large objects to coexist better. It remains off by default since it does have a small impact on large traffic by default (shorter batches).	2020-06-24 12:21:26 +02:00
Willy Tarreau	59153fef86	MINOR: tasks: make run_tasks_from_lists() scan the queues itself Now process_runnable_tasks is responsible for calculating the budgets for each queue, dequeuing from the tree, and calling run_tasks_from_lists(). This latter one scans the queues, picking tasks there and respecting budgets. Note that its name was updated with a plural "s" for this reason.	2020-06-24 12:21:26 +02:00
Willy Tarreau	ba48d5c8f9	MINOR: tasks: pass the queue index to run_task_from_list() Instead of passing it a pointer to the queue, pass it the queue's index so that it can perform all the work around current_queue and tl_class_mask.	2020-06-24 12:21:26 +02:00
Willy Tarreau	49f90bf148	MINOR: tasks: add a mask of the queues with active tasklets It is neither convenient nor scalable to check each and every tasklet queue to figure whether it's empty or not while we often need to check them all at once. This patch introduces a tasklet class mask which gets a bit 1 set for each queue representing one class of service. A single test on the mask allows to figure whether there's still some work to be done. It will later be usable to better factor the runqueue code. Bits are set when tasklets are queued. They're cleared when queues are emptied. It is possible that a queue is empty but has a bit if a tasklet was added then removed, but this is not a problem as this is properly checked for in run_tasks_from_list().	2020-06-24 12:21:26 +02:00
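The class mask introduced in 49f90bf148 can be pictured with the small sketch below. All names (`sched_sketch`, `sketch_enqueue()`, `sketch_mark_empty()`, `sketch_has_work()`) and the number of classes are hypothetical, and plain counters stand in for the real tasklet lists.

```c
#include <assert.h>
#include <stdbool.h>

#define NB_CLASSES 3  /* hypothetical number of tasklet classes */

struct sched_sketch {
    unsigned int class_mask;          /* bit i set => queue i may hold work */
    unsigned int queued[NB_CLASSES];  /* stand-in for the real tasklet lists */
};

/* Queues one tasklet in class <cls> and flags the class as active. */
static void sketch_enqueue(struct sched_sketch *s, unsigned int cls)
{
    assert(cls < NB_CLASSES);
    s->queued[cls]++;
    s->class_mask |= 1U << cls;
}

/* Called when the scanner finds queue <cls> empty or finishes emptying it. */
static void sketch_mark_empty(struct sched_sketch *s, unsigned int cls)
{
    assert(cls < NB_CLASSES);
    s->queued[cls] = 0;
    s->class_mask &= ~(1U << cls);
}

/* A single test on the mask replaces a walk over every queue. */
static bool sketch_has_work(const struct sched_sketch *s)
{
    return s->class_mask != 0;
}
```

As the commit notes, a bit may stay set for a queue that has become empty; that is harmless as long as the scanner re-checks the queue itself and clears the bit when it finds nothing.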
Willy Tarreau	c0a08ba2df	MINOR: tasks: make current_queue an index instead of a pointer It will be convenient to have the tasklet queue number soon, better make current_queue an index rather than a pointer to the queue. When not currently running (e.g. from I/O), the index is -1.	2020-06-24 12:21:26 +02:00
William Lallemand	ee8530c65e	MINOR: ssl: free the crtlist and the ckch during the deinit() Add some functions to deinit the whole crtlist and ckch architecture. It will free all crtlist, crtlist_entry, ckch_store, ckch_inst and their associated SNI, ssl_conf and SSL_CTX. The SSL_CTX in the default_ctx and initial_ctx still needs to be free'd separately.	2020-06-23 20:07:50 +02:00
William Lallemand	7df5c2dc3c	BUG/MEDIUM: ssl: fix ssl_bind_conf double free Since commit `2954c47` ("MEDIUM: ssl: allow crt-list caching"), the ssl_bind_conf is allocated directly in the crt-list, and the crt-list can be shared between several bind_conf. The deinit() code wasn't changed to handle that. This patch fixes the issue by removing the free of the ssl_conf in ssl_sock_free_all_ctx(). It should be completed with a patch that free the ssl_conf and the crt-list. Fix issue #700.	2020-06-23 20:06:55 +02:00
Willy Tarreau	5bd73063ab	BUG/MEDIUM: task: be careful not to run too many tasks at TL_URGENT A test on large objects revealed a big performance loss from 2.1. The cause was found to be related to cache locality between scheduled operations that are batched using tasklets. It happens that we now have several layers of tasklets and that queuing all these operations leaves time to let memory objects cool down in the CPU cache, effectively resulting in halving the performance. A quick test consisting in putting most unknown tasklets into the BULK queue almost fixed the performance regression, but this is a wrong approach as it can also slow down some low-latency transfers or access to applets like the CLI. What this patch does instead is to queue unknown tasklets into the same queue as the current one when tasklet_wakeup() is itself called from a task/tasklet, otherwise it uses urgent for real I/O (when sched->current is NULL). This results in the called tasklet being woken up much sooner, often at the end of the current batch of tasklets. By doing so, a test on 2 cores 4 threads with 256 concurrent H1 conns transferring 16m objects with 256kB buffers jumped from 55 to 88 Gbps. It's even possible to go as high as 101 Gbps by evaluating the URGENT queue after the BULK one, though this was not done as considered dangerous for latency sensitive operations. This reinforces the importance of getting back the CPU transfer mechanisms based on tasklet_wakeup_after() to work at the tasklet level by supporting an immediate wakeup in certain cases. No backport is needed, this is strictly 2.2.	2020-06-23 16:45:28 +02:00
Willy Tarreau	116ef223d2	MINOR: task: add a new pointer to current tasklet queue In task_per_thread[] we now have current_queue which is a pointer to the current tasklet_list entry being evaluated. This will be used to know the class under which the current task/tasklet is currently running.	2020-06-23 16:35:38 +02:00
Willy Tarreau	38e8a1c7b8	MINOR: debug: add a new DEBUG_FD build option When DEBUG_FD is set at build time, we'll keep a counter of per-FD events in the fdtab. This counter is reported in "show fd" even for closed FDs if not zero. The purpose is to help spot situations where an apparently closed FD continues to be reported in loops, or where some events are dismissed.	2020-06-23 10:04:54 +02:00
Willy Tarreau	d1d005d7f6	MEDIUM: map: make the "clear map" operation yield As reported in issue #419, a "clear map" operation on a very large map can take a lot of time and freeze the entire process for several seconds. This patch makes sure that pat_ref_prune() can regularly yield after clearing some entries so that the rest of the process continues to work. The first part, the removal of the patterns, can take quite some time by itself in one run but it's still relatively fast. It may block for up to 100ms for 16M IP addresses in a tree typically. This change needed to declare an I/O handler for the clear operation so that we can get back to it after yielding. The second part can be much slower because it deconstructs the elements and its users, but it iterates progressively so we can yield less often here. The patch was tested with traffic in parallel soliciting the map being released and showed no problem. Some traffic will definitely notice an incomplete map but the filling is already not atomic anyway thus this is not different. It may be backported to stable versions once sufficiently tested for side effects, at least as far as 2.0 in order to avoid the watchdog triggering when the process is frozen there. For a better behaviour, all these prune_* functions should support yielding so that the callers have a chance to yield in turn as well.	2020-06-19 16:57:51 +02:00
Willy Tarreau	bc52bec163	MEDIUM: fd: add experimental support for edge-triggered polling Some of the recent optimizations around the polling to save a few epoll_ctl() calls have shown that they could also cause some trouble. However, over time our code base has become totally asynchronous with I/Os always attempted from the upper layers and only retried at the bottom, making it look like we're getting closer to EPOLLET support. There are showstoppers there such as the listeners which cannot support this. But given that most of the epoll_ctl() dance comes from the connections, we can try to enable edge-triggered polling on connections. What this patch does is to add a new global tunable "tune.fd.edge-triggered", that makes fd_insert() automatically set an et_possible bit on the fd if the I/O callback is conn_fd_handler. When the epoll code sees an update for such an FD, it immediately registers it in both directions the first time and doesn't update it anymore. On a few tests it proved quite useful with a 14% request rate increase in a H2->H1 scenario, reducing the epoll_ctl() calls from 2 per request to 2 per connection. The option is obviously disabled by default as bugs are still expected, particularly around the subscribe() code where it is possible that some layers do not always re-attempt reading data after being woken up.	2020-06-19 14:21:46 +02:00
Dragan Dosen	13cd54c08b	MEDIUM: peers: add the "localpeer" global option localpeer <name> Sets the local instance's peer name. It will be ignored if the "-L" command line argument is specified or if used after "peers" section definitions. In such cases, a warning message will be emitted during the configuration parsing. This option will also set the HAPROXY_LOCALPEER environment variable. See also "-L" in the management guide and "peers" section in the configuration manual.	2020-06-19 11:37:30 +02:00
Dragan Dosen	4f01415d3b	MINOR: peers: do not use localpeer as an array anymore It is now dynamically allocated by using strdup().	2020-06-19 11:37:11 +02:00
Willy Tarreau	7af4fa9a48	MINOR: activity: rename the "stream" field to "stream_calls" This one was confusingly called, I thought it was the cumulated number of streams but it's the number of calls to process_stream(). Let's make this clearer.	2020-06-17 20:52:29 +02:00
Willy Tarreau	e406386542	MINOR: activity: rename confusing poll_* fields in the output We have poll_drop, poll_dead and poll_skip which are confusingly named like their poll_io and poll_exp counterparts except that they are not per poll() call but per-fd. This patch renames them to poll_drop_fd(), poll_dead_fd() and poll_skip_fd() for this reason.	2020-06-17 20:35:33 +02:00
Willy Tarreau	e545153c50	MINOR: activity: report the number of times poll() reports I/O The "show activity" output mentions a number of indicators to explain wake up reasons but doesn't have the number of times poll() sees some I/O. And given that multiple events can happen simultaneously, it's not always possible to deduce this metric by subtracting. This patch adds a new "poll_io" counter that allows one to see how often poll() returns with at least one active FD. This should help detect stuck events and measure various ratios of poll sub-metrics.	2020-06-17 20:25:18 +02:00
Willy Tarreau	c208a54ab2	DOC: fd: make it clear that some fields ordering must absolutely be respected fd_set_running() and fd_takeover() may both use a double-word CAS on the (running_mask, thread_mask) couple and as such they expect the fields to be exactly arranged like this. It's critical not to reorder them, so add a comment to avoid such a potential mistake later.	2020-06-17 19:58:37 +02:00
Willy Tarreau	4f72ec851c	CLEANUP: activity: remove unused counter fd_lock Since 2.1-dev2, with commit `305d5ab46` ("MAJOR: fd: Get rid of the fd cache.") we don't have the fd_lock anymore and as such its activity counter is always zero. Let's remove it from the struct and from "show activity" output, as there are already plenty of indicators to look at. The cache line comment in the struct activity was updated to reflect reality as it looks like another one already got removed in the past.	2020-06-17 19:15:51 +02:00
Willy Tarreau	6d4c81db96	MINOR: compiler: always define __has_feature() This macro is provided by clang but gcc lacks it. Not having it makes it painful to test features on both compilers. Better define it to zero when not available so that __has_feature(foo) never errors.	2020-06-16 19:13:24 +02:00
Willy Tarreau	c8d167bcfb	MINOR: tools: add a new configurable line parse, parse_line() This function takes on input a string to tokenize, an output storage (which may be the same) and a number of options indicating how to handle certain characters (single & double quote support, backslash support, end of line on '#', environment variables etc). On output it will provide a list of pointers to individual words after having possibly unescaped some character sequences, handled quotes and resolved environment variables, and it will also indicate a status made of: - a list of failures (overlap between src/dst, wrong quote etc) - the pointer to the first sequence in error - the required output length (a-la snprintf()). This allows a caller to freely unescape/unquote a string by using a pre-allocated temporary buffer and expand it as necessary. It takes extreme care at avoiding expensive operations and intentionally does not use memmove() when removing escapes, hence the reason for the different input and output buffers. The goal is to use it as the basis for the config parser.	2020-06-16 16:27:26 +02:00
Willy Tarreau	853926a9ac	BUG/MEDIUM: ebtree: use a byte-per-byte memcmp() to compare memory blocks As reported in issue #689, there is a subtle bug in the ebtree code used to compare memory blocks. It stems from the platform-dependent memcmp() implementation. Original implementations used to perform a byte-per-byte comparison and to stop at the first non-matching byte, as in this old example: https://www.retro11.de/ouxr/211bsd/usr/src/lib/libc/compat-sys5/memcmp.c.html The ebtree code has been relying on this to detect the first non-matching byte when comparing keys. This is made so that a zero-terminated string can fail to match against a longer string. Over time, especially with large buses and SIMD instruction sets, multi-byte comparisons have appeared, making the processor fetch bytes past the first different byte, which could possibly be a trailing zero. This means that it's possible to read past the allocated area for a string if it was allocated by strdup(). This is not correct and definitely confuses address sanitizers. In real life the problem doesn't have visible consequences. Indeed, multi-byte comparisons are implemented so that aligned words are loaded (e.g. 512 bits at once to process a cache line at a time). So there is no way such a multi-byte access will cross a page boundary and end up reading from an unallocated zone. This is why it was never noticed before. This patch addresses this by implementing a one-byte-at-a-time memcmp() variant for ebtree, called eb_memcmp(). It's optimized for both small and long strings and guarantees to stop after the first non-matching byte. It only needs 5 instructions in the loop and was measured to be 3.2 times faster than the glibc's AVX2-optimized memcmp() on short strings (1 to 257 bytes), since that latter one comes with a significant setup cost. The break-even seems to be at 512 bytes where both versions perform equally, which is way longer than what's used in general here. 
This fix should be backported to stable versions and reintegrated into the ebtree code.	2020-06-16 11:30:33 +02:00
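The guarantee that 853926a9ac relies on, never reading past the first differing byte, can be illustrated with the naive sketch below. It is not the actual eb_memcmp() from the patch (whose optimized loop is not reproduced here); `bytewise_memcmp()` is a hypothetical name used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Compares up to <len> bytes one at a time and never reads past the first
 * mismatching byte. The return value follows the memcmp() convention:
 * negative, zero or positive.
 */
static int bytewise_memcmp(const void *a, const void *b, size_t len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t i;

    assert(len == 0 || (pa != NULL && pb != NULL));

    for (i = 0; i < len; i++) {
        if (pa[i] != pb[i])
            return (int)pa[i] - (int)pb[i];
    }
    return 0;
}
```

Because the loop returns as soon as a byte differs, comparing a short zero-terminated key against a longer one stops at the trailing zero at the latest, so the shorter allocation is never overrun.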
Willy Tarreau	f3ca5a0273	BUILD: haproxy: mark deinit_and_exit() as noreturn Commit `0a3b43d9c` ("MINOR: haproxy: Make use of deinit_and_exit() for clean exits") introduced this build warning: src/haproxy.c: In function 'main': src/haproxy.c:3775:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ This is because the new deinit_and_exit() is not marked as "noreturn" so depending on the optimizations, the noreturn attribute of exit() will either leak through it and silence the warning or not and confuse the compiler. Let's just add the attribute to fix this. No backport is needed, this is purely 2.2.	2020-06-15 18:43:46 +02:00
Willy Tarreau	bcefb85009	BUILD: atomic: add string.h for memcpy() on ARM64 As reported in issue #686, ARM64 build fails since the include files reorganization. This is caused by the lack of string.h while a memcpy() is present in __ha_cas_dw().	2020-06-14 08:08:13 +02:00
Tim Duesterhus	2654055316	MINOR: haproxy: Add void deinit_and_exit(int) This helper function calls deinit() and then exit() with the given status.	2020-06-14 07:39:42 +02:00
Willy Tarreau	db57a142c3	BUILD: thread: add parentheses around values of locking macros clang just failed on fd.c with this error: src/fd.c:491:9: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses] while (HA_SPIN_TRYLOCK(OTHER_LOCK, &log_lock) != 0) { ^ ~~ That's because this expands to this: while (!pl_try_s(&log_lock) != 0) { Let's just add parentheses in the TRYLOCK macros to avoid this. This may need to be backported if commit `df187875d` ("BUG/MEDIUM: log: don't hold the log lock during writev() on a file descriptor") is backported as well as it seems to be the first one to trigger it.	2020-06-12 11:46:44 +02:00
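The macro issue fixed by db57a142c3 can be reproduced in miniature as shown below. `fake_try()`, `TRYLOCK_BAD()` and `TRYLOCK_GOOD()` are hypothetical stand-ins for pl_try_s() and the real TRYLOCK macros; only the shape of the expansion is meant to match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical try-lock primitive: returns non-zero when the lock was taken. */
static int fake_try(int *lock)
{
    assert(lock != NULL);

    if (*lock)
        return 0;
    *lock = 1;
    return 1;
}

/* Unparenthesized: "TRYLOCK_BAD(l) != 0" expands to "!fake_try(l) != 0",
 * the pattern that clang rejects with -Wlogical-not-parentheses.
 */
#define TRYLOCK_BAD(l)  !fake_try(l)

/* Parenthesized: expands to "(!fake_try(l)) != 0"; same value, no warning. */
#define TRYLOCK_GOOD(l) (!fake_try(l))
```

Both macros yield 0 on success and 1 on failure; the parentheses change nothing at run time, they only make the intended grouping explicit to the compiler.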
Willy Tarreau	7c18b54106	REORG: dgram: rename proto_udp to dgram The set of files proto_udp.{c,h} were misleadingly named, as they do not provide anything related to the UDP protocol but to datagram handling instead, since currently all UDP processing is hard-coded where it's used (dns, logs). They are to UDP what connection.{c,h} are to proto_tcp. This was causing confusion about how to insert UDP socket management code, so let's rename them right now to dgram.{c,h} which more accurately matches what's inside since every function and type is already prefixed with "dgram_".	2020-06-11 10:18:59 +02:00
Willy Tarreau	e5793916f0	REORG: include: make list-t.h part of the base API There are list definitions everywhere in the code, let's drop the need for including list-t.h to declare them. The rest of the list manipulation is huge however and not needed everywhere so using the list walking macros still requires to include list.h.	2020-06-11 10:18:59 +02:00
Willy Tarreau	b2551057af	CLEANUP: include: tree-wide alphabetical sort of include files This patch fixes all the leftovers from the include cleanup campaign. There were not that many (~400 entries in ~150 files) but it was definitely worth doing it as it revealed a few duplicates.	2020-06-11 10:18:59 +02:00
Willy Tarreau	5b9cde4820	REORG: include: move THREAD_LOCAL and __decl_thread() to compiler.h Since these are used as type attributes or conditional clauses, they are used about everywhere and should not require a dependency on thread.h. Moving them to compiler.h along with other similar statements like ALIGN() etc looks more logical; this way they become part of the base API. This allowed to remove thread-t.h from ~12 files, one was found to only require thread-t and not thread and dict.c was found to require thread.h.	2020-06-11 10:18:59 +02:00
Willy Tarreau	ca8b069aa7	REORG: include: move MAX_THREADS to defaults.h That's already where MAX_PROCS is set, and we already handle the case of the default value so there is no reason for placing it in thread.h given that most call places don't need the rest of the threads definitions. The include was removed from global-t.h and activity.c.	2020-06-11 10:18:59 +02:00
Willy Tarreau	6784c99463	CLEANUP: include: make atomic.h part of the base API Atomic ops are used about everywhere, let's make them part of the base API by including atomic.h in api.h.	2020-06-11 10:18:59 +02:00
Willy Tarreau	8e3f5c6661	CLEANUP: compiler: add a THREAD_ALIGNED macro and use it where appropriate Sometimes we need to align a struct member or a struct's size only when threads are enabled. This is the case on fdtab for example. Instead of using ugly ifdefs in the code itself, let's have a THREAD_ALIGNED() macro performing the alignment only when threads are enabled. For now this was only applied to fd-t.h as it was the only place found.	2020-06-11 10:18:59 +02:00
Willy Tarreau	36979d9ad5	REORG: include: move the error reporting functions from log.h to errors.h Most of the files dealing with error reports have to include log.h in order to access ha_alert(), ha_warning() etc. But while these functions don't depend on anything, log.h depends on a lot of stuff because it deals with log-formats and samples. As a result it's impossible not to embark long dependencies when using ha_warning() or qfprintf(). This patch moves these low-level functions to errors.h, which already defines the error codes used at the same places. About half of the users of log.h could be adjusted, sometimes revealing other issues such as missing tools.h. Interestingly the total preprocessed size shrank by 4%.	2020-06-11 10:18:59 +02:00
Willy Tarreau	251c2aae06	CLEANUP: include: move sample_data out of sample-t.h The struct sample_data is used by pattern, map and vars, and currently requires to include sample-t which comes with many other dependencies. Let's move sample_data into its own file to shorten the dependency tree. This revealed a number of issues in adjacent files which were hidden by the fact that sample-t.h brought everything that was missing.	2020-06-11 10:18:59 +02:00
Willy Tarreau	4f663ec022	CLEANUP: include: don't include proxy-t.h in global-t.h We only need a forward declaration here to avoid embarking lots of files, and by just doing this we reduce the build size by 3.5%.	2020-06-11 10:18:59 +02:00
Willy Tarreau	d62af6abe4	CLEANUP: include: don't include stddef.h directly Directly including stddef.h in many files results in it being processed multiple times while it can be centralized in api-t.h and be guarded against multiple inclusions. Doing so reduces the number of preprocessed lines by 1200!	2020-06-11 10:18:59 +02:00
Willy Tarreau	bcc6733fab	REORG: check: extract the external checks from check.{c,h} The health check code is ugly enough, let's take the external checks out of it to simplify the code and shrink the file a little bit.	2020-06-11 10:18:58 +02:00
Willy Tarreau	d604ace940	REORG: check: move email_alert* from proxy-t.h to mailers-t.h These ones are specific to mailers and have nothing to do in proxy-t.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	51cd5956ee	REORG: check: move tcpchecks away from check.c Checks.c remains one of the largest files of the project and it contains too many things. The tcpchecks code represents half of this file, and both parts are relatively isolated, so let's move it away into its own file. We now have tcpcheck.c, tcpcheck{,-t}.h. Doing so required to export quite a number of functions because check.c has almost everything made static, which really doesn't help to split!	2020-06-11 10:18:58 +02:00
Willy Tarreau	cee013e4e0	REORG: check: move the e-mail alerting code to mailers.c check.c is one of the largest files and contains too many things. The e-mail alerting code is stored there while nothing is in mailers.c. Let's move this code out. That's only 4% of the code but a good start. In order to do so, a few tcp-check functions had to be exported.	2020-06-11 10:18:58 +02:00
Willy Tarreau	4f6535d734	CLEANUP: hpack: export debug functions and move inlines to .h When building contrib/hpack there is a warning about an unused static function. Actually it makes no sense to make it static, instead it must be regularly exported. Similarly there is hpack_dht_get_tail() which is inlined in the C file and which would make more sense with all other ones in the H file.	2020-06-11 10:18:58 +02:00
Willy Tarreau	6be7849f39	REORG: include: move cfgparse.h to haproxy/cfgparse.h There's no point splitting the file in two since only cfgparse uses the types defined there. A few call places were updated and cleaned up. All of them were in C files which register keywords. There is nothing left in common/ now so this directory must not be used anymore.	2020-06-11 10:18:58 +02:00
Willy Tarreau	dfd3de8826	REORG: include: move stream.h to haproxy/stream{,-t}.h This one was not easy because it was embarking many includes with it, which other files would automatically find. At least global.h, arg.h and tools.h were identified. 93 total locations were identified, 8 additional includes had to be added. In the rare files where it was possible to finalize the sorting of includes by adjusting only one or two extra lines, it was done. But all files would need to be rechecked and cleaned up now. It was the last set of files in types/ and proto/ and these directories must not be reused anymore.	2020-06-11 10:18:58 +02:00
Willy Tarreau	1e56f92693	REORG: include: move server.h to haproxy/server{,-t}.h extern struct dict server_name_dict was moved from the type file to the main file. A handful of inlined functions were moved at the bottom of the file. Call places were updated to use server-t.h when relevant, or to simply drop the entry when not needed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	a55c45470f	REORG: include: move queue.h to haproxy/queue{,-t}.h Nothing outstanding here. A number of call places were not justified and removed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	4980160ecc	REORG: include: move backend.h to haproxy/backend{,-t}.h The files remained mostly unchanged since they were OK. However, half of the users didn't need to include them, and about as many actually needed to have it and used to find functions like srv_currently_usable() through a long chain that broke when moving the file.	2020-06-11 10:18:58 +02:00
Willy Tarreau	6c58ab0304	REORG: include: move spoe.h to haproxy/spoe{,-t}.h Only minor change was to make sure all defines were before the structs in spoe-t.h, everything else went smoothly.	2020-06-11 10:18:58 +02:00
Willy Tarreau	a264d960f6	REORG: include: move proxy.h to haproxy/proxy{,-t}.h This one is particularly difficult to split because it provides all the functions used to manipulate a proxy state and to retrieve names or IDs for error reporting, and as such, it was included in 73 files (down to 68 after cleanup). It would deserve a small cleanup though the cut points are not obvious at the moment given the number of structs involved in the struct proxy itself.	2020-06-11 10:18:58 +02:00
Willy Tarreau	aeed4a85d6	REORG: include: move log.h to haproxy/log{,-t}.h The current state of the logging is a real mess. The main problem is that almost all files include log.h just in order to have access to the alert/warning functions like ha_alert() etc, and don't care about logs. But log.h also deals with real logging as well as log-format and depends on stream.h and various other things. As such it forces a few heavy files like stream.h to be loaded early and to hide missing dependencies depending where it's loaded. Among the missing ones is syslog.h which was often automatically included resulting in no less than 3 users missing it. Among 76 users, only 5 could be removed, and probably 70 don't need the full set of dependencies. A good approach would consist in splitting that file in 3 parts: - one for error output ("errors" ?). - one for log_format processing - and one for actual logging.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c6599682d5	REORG: include: move fcgi-app.h to haproxy/fcgi-app{,-t}.h Only arg-t.h was missing from the types to get arg_list.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c7babd8570	REORG: include: move filters.h to haproxy/filters{,-t}.h Just a minor change, moved the macro definitions upwards. A few caller files were updated since they didn't need to include it.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c2b1ff04e5	REORG: include: move http_ana.h to haproxy/http_ana{,-t}.h It was moved without any change, however many callers didn't need it at all. This was a consequence of the split of proto_http.c into several parts that resulted in many locations to still reference it.	2020-06-11 10:18:58 +02:00
Willy Tarreau	f1d32c475c	REORG: include: move channel.h to haproxy/channel{,-t}.h The files were moved with no change. The callers were cleaned up a bit and a few of them had channel.h removed since not needed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	5e539c9b8d	REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h Almost no changes, removed stdlib and added buf-t and connection-t to the types to avoid a warning.	2020-06-11 10:18:58 +02:00
Willy Tarreau	209108dbbd	REORG: include: move ssl_sock.h to haproxy/ssl_sock{,-t}.h Almost nothing changed, just moved a static inline at the end and moved an export from the types to the main file.	2020-06-11 10:18:58 +02:00
Willy Tarreau	2867159d63	REORG: include: move lb_map.h to haproxy/lb_map{,-t}.h Nothing was changed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	83487a833c	REORG: include: move cli.h to haproxy/cli{,-t}.h Almost no change except moving the cli_kw struct definition after the defines. Almost all users had both types&proto included, which is not surprising since this code is old and it used to be the norm a decade ago. These places were cleaned.	2020-06-11 10:18:58 +02:00
Willy Tarreau	2eec9b5f95	REORG: include: move stats.h to haproxy/stats{,-t}.h Just some minor reordering, and the usual cleanup of call places for those which didn't need it. We don't include the whole tools.h into stats-t anymore but just tools-t.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	3f0f82e7a9	REORG: move applet.h to haproxy/applet{,-t}.h The type file was slightly tidied. The cli-specific APPCTX_CLI_ST1_* flag definitions were moved to cli.h. The type file was adjusted to include buf-t.h and not the huge buf.h. A few call places were fixed because they did not need this include.	2020-06-11 10:18:58 +02:00
Willy Tarreau	8c42b8a147	REORG: include: split common/uri_auth.h into haproxy/uri_auth{,-t}.h Initially it looked like this could have been placed into auth.h or stats.h but it's not the case as it's what makes the link between them and the HTTP layer. However the file needed to be split in two. Quite a number of call places were dropped because these were mostly leftovers from the early days where the stats and cli were packed together.	2020-06-11 10:18:58 +02:00
Willy Tarreau	dcc048a14a	REORG: include: move acl.h to haproxy/acl{,-t}.h The files were moved almost as-is, just dropping arg-t and auth-t from acl-t but keeping arg-t in acl.h. It was useful to revisit the call places since a handful of files used to continue to include acl.h while they did not need it at all. Struct stream was only made a forward declaration since not otherwise needed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c6d61d762f	REORG: include: move trace.h to haproxy/trace{,-t}.h Only thread-t was added to satisfy THREAD_LOCAL but the rest was OK.	2020-06-11 10:18:58 +02:00
Willy Tarreau	48d25b3bc9	REORG: include: move session.h to haproxy/session{,-t}.h Almost no change was needed beyond a little bit of reordering of the types file and adjustments to use session-t instead of session at a few places.	2020-06-11 10:18:58 +02:00
Willy Tarreau	872f2ea209	REORG: include: move stick_table.h to haproxy/stick_table{,-t}.h The stktable_types[] array declaration was moved to the main file as it had nothing to do in the types. A few declarations were reordered in the types file so that defines were before the structs. Thread-t was added since there are a few __decl_thread(). The loss of peers.h revealed that cfgparse-listen needed it.	2020-06-11 10:18:58 +02:00
Willy Tarreau	3c2a7c2788	REORG: include: move peers.h to haproxy/peers{,-t}.h The cfg_peers external declaration was moved to the main file instead of the type one. A few types were still missing from the proto, causing warnings in the functions prototypes (proxy, stick_table).	2020-06-11 10:18:58 +02:00
Willy Tarreau	126ba3a1e1	REORG: include: move http_fetch.h to haproxy/http_fetch.h There's no type file for this trivial one. The unneeded dependency on htx.h was dropped.	2020-06-11 10:18:58 +02:00
Willy Tarreau	4aa573da6f	REORG: include: move checks.h to haproxy/check{,-t}.h All includes that were not absolutely necessary were removed because checks.h happens to very often be part of dependency loops. A warning was added about this in check-t.h. The fields, enums and structs were a bit tidied because it's particularly tedious to find anything there. It would make sense to split this in two or more files (at least extract tcp-checks). The file was renamed to the singular because it was one of the rare exceptions to have an "s" appended to its name compared to the struct name.	2020-06-11 10:18:58 +02:00
Willy Tarreau	7ea393d95e	REORG: include: move connection.h to haproxy/connection{,-t}.h The type file is becoming a mess, half of it is for the proxy protocol, another good part describes conn_streams and mux ops, it would deserve being split again. At least it was reordered so that elements are easier to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved to compat.h as it's said to be the value for Linux.	2020-06-11 10:18:58 +02:00
Willy Tarreau	8b550afe1e	REORG: include: move tcp_rules.h to haproxy/tcp_rules.h There's no type file on this one which is pretty simple.	2020-06-11 10:18:58 +02:00
Willy Tarreau	3727a8a083	REORG: include: move signal.h to haproxy/signal{,-t}.h No change was necessary. Include from wdt.c was dropped since unneeded.	2020-06-11 10:18:58 +02:00
Willy Tarreau	fc77454aff	REORG: include: move proto_tcp.h to haproxy/proto_tcp.h There was no type file. This one really is trivial. A few missing includes were added to satisfy the exported functions prototypes.	2020-06-11 10:18:58 +02:00
Willy Tarreau	cea0e1bb19	REORG: include: move task.h to haproxy/task{,-t}.h The TASK_IS_TASKLET() macro was moved to the proto file instead of the type one. The proto part was a bit reordered to remove a number of ugly forward declarations of static inline functions. About ten C and H files had their dependency dropped since they were not using anything from task.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	f268ee8795	REORG: include: split global.h into haproxy/global{,-t}.h global.h was one of the messiest files, it has accumulated tons of implicit dependencies and declares many globals that make almost all other files include it. It managed to silence a dependency loop between server.h and proxy.h by being well placed to pre-define the required structs, forcing struct proxy and struct server to be forward-declared in a significant number of files. It was split in two, one which is the global struct definition and the few macros and flags, and the rest containing the functions prototypes. The UNIX_MAX_PATH definition was moved to compat.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	a171892501	REORG: include: move vars.h to haproxy/vars{,-t}.h A few includes (sessions.h, stream.h, api-t.h) were added for arguments that were first declared in function prototypes.	2020-06-11 10:18:58 +02:00
Willy Tarreau	b23e5958ed	REORG: include: move protocol_buffers.h to haproxy/protobuf{,-t}.h There is no C file for this one, the code was placed into sample.c which thus has a dependency on this file which itself includes sample.h. Probably that it would be wise to split that later.	2020-06-11 10:18:58 +02:00
Willy Tarreau	e6ce10be85	REORG: include: move sample.h to haproxy/sample{,-t}.h This one is particularly tricky to move because everyone uses it and it depends on a lot of other types. For example it cannot include arg-t.h and must absolutely only rely on forward declarations to avoid dependency loops between vars -> sample_data -> arg. In order to address this one, it would be nice to split the sample_data part out of sample.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	469509b39e	REORG: include: move payload.h to haproxy/payload.h There's no type file, it only contains fetch_rdp_cookie_name() and val_payload_lv() which probably ought to move somewhere else instead of staying there.	2020-06-11 10:18:58 +02:00
Willy Tarreau	2cd5809f94	REORG: include: move map to haproxy/map{,-t}.h Only small cleanups, and removal of a few includes from files that didn't need them.	2020-06-11 10:18:58 +02:00
Willy Tarreau	225a90aaec	REORG: include: move pattern.h to haproxy/pattern{,-t}.h It was moved as-is, except for extern declaration of pattern_reference. A few C files used to include it but didn't need it anymore after having been split apart so this was cleaned.	2020-06-11 10:18:58 +02:00
Willy Tarreau	213e99073b	REORG: include: move listener.h to haproxy/listener{,-t}.h stdlib and list were missing from listener.h, otherwise it was OK.	2020-06-11 10:18:58 +02:00
Willy Tarreau	546ba42c73	REORG: include: move lb_fwrr.h to haproxy/lb_fwrr{,-t}.h Nothing fancy, includes were already OK. The proto didn't reference the type, this was fixed. Still references proxy.h and server.h from types/.	2020-06-11 10:18:58 +02:00
Willy Tarreau	0254941666	REORG: include: move lb_fwlc.h to haproxy/lb_fwlc{,-t}.h Nothing fancy, includes were already OK. The proto didn't reference the type, this was fixed. Still references proxy.h and server.h from types/.	2020-06-11 10:18:58 +02:00
Willy Tarreau	b5fc3bf6dc	REORG: include: move lb_fas.h to haproxy/lb_fas{,-t}.h Nothing fancy, includes were already OK. The proto didn't reference the type, this was fixed. Still references proxy.h and server.h from types/.	2020-06-11 10:18:58 +02:00
Willy Tarreau	fbe8da3320	REORG: include: move lb_chash.h to haproxy/lb_chash{,-t}.h Nothing fancy, includes were already OK. The proto didn't reference the type, this was fixed. Still references proxy.h and server.h from types/.	2020-06-11 10:18:58 +02:00
Willy Tarreau	52d88725ab	REORG: move ssl_crtlist.h to haproxy/ssl_crtlist{,-t}.h These files were already clean as well. Just added ebptnode which is needed in crtlist_entry.	2020-06-11 10:18:58 +02:00
Willy Tarreau	47d7f9064d	REORG: include: move ssl_ckch.h to haproxy/ssl_ckch{,-t}.h buf-t and ebmbtree were included.	2020-06-11 10:18:58 +02:00
Willy Tarreau	b2bd865804	REORG: include: move ssl_utils.h to haproxy/ssl_utils.h Just added buf-t and openssl-compat for the missing types that appear in the prototypes.	2020-06-11 10:18:57 +02:00
Willy Tarreau	b5abe5bd5d	REORG: include: move mworker.h to haproxy/mworker{,-t}.h One function prototype makes reference to struct mworker_proc which was not defined there but in global.h instead. This definition, along with the PROC_O_* fields were moved to mworker-t.h instead.	2020-06-11 10:18:57 +02:00
Willy Tarreau	d7d2c28104	CLEANUP: include: remove unused mux_pt.h It used to be needed to export mux_pt_ops when it was the only way to detect a mux but that's no longer the case.	2020-06-11 10:18:57 +02:00
Willy Tarreau	c761f843da	REORG: include: move http_rules.h to haproxy/http_rules.h There was no include file. This one still includes types/proxy.h.	2020-06-11 10:18:57 +02:00
Willy Tarreau	8efbdfb77b	REORG: include: move obj_type.h to haproxy/obj_type{,-t}.h No change was necessary. It still includes lots of types/* files.	2020-06-11 10:18:57 +02:00
Willy Tarreau	762d7a5117	REORG: include: move frontend.h to haproxy/frontend.h There was no type file for this one, it only contains frontend_accept().	2020-06-11 10:18:57 +02:00
Willy Tarreau	278161c1b8	REORG: include: move capture.h to haproxy/capture{,-t}.h The file was split into two since it contains a variable declaration.	2020-06-11 10:18:57 +02:00
Willy Tarreau	cc9bbfb7b5	REORG: include: split mailers.h into haproxy/mailers{,-t}.h The file mostly contained struct definitions but there was also a variable export. Most of the stuff currently lies in checks.h and should definitely move here!	2020-06-11 10:18:57 +02:00
Willy Tarreau	167e1eb7c7	REORG: include: move counters.h to haproxy/counters-t.h Since these are only type definitions, let's move them to counters-t.h and reserve counters.h for when functions will be needed.	2020-06-11 10:18:57 +02:00
Willy Tarreau	7d865a5e3e	REORG: include: move flt_http_comp.h to haproxy/ There was no type definition for this file which was moved as-is.	2020-06-11 10:18:57 +02:00
Willy Tarreau	eb92deb500	REORG: include: move dns.h to haproxy/dns{,-t}.h The files were moved as-is.	2020-06-11 10:18:57 +02:00
Willy Tarreau	ac13aeaa89	REORG: include: move auth.h to haproxy/auth{,-t}.h The STATS_DEFAULT_REALM and STATS_DEFAULT_URI were moved to defaults.h. It was required to include types/pattern.h and types/sample.h since they are mentioned in function prototypes. It would be wise to merge this with uri_auth.h later.	2020-06-11 10:18:57 +02:00
Willy Tarreau	aa74c4e1b3	REORG: include: move arg.h to haproxy/arg{,-t}.h Almost no change was needed; chunk.h was replaced with buf-t.h. It depends on types/vars.h and types/protocol_buffers.h.	2020-06-11 10:18:57 +02:00
Willy Tarreau	122eba92b7	REORG: include: move action.h to haproxy/action{,-t}.h List.h was missing for LIST_ADDQ(). A few unneeded includes of action.h were removed from certain files. This one still relies on applet.h and stick-table.h.	2020-06-11 10:18:57 +02:00
Willy Tarreau	8c794000c4	REORG: include: move hlua_fcn.h to haproxy/hlua_fcn.h Added lua.h which was missing from the includes.	2020-06-11 10:18:57 +02:00
Willy Tarreau	8641605ff6	REORG: include: move hlua.h to haproxy/hlua{,-t}.h This one required a few more includes as it uses list and ebpt_node. It still references lots of types/ files for now.	2020-06-11 10:18:57 +02:00
Willy Tarreau	87735330d1	REORG: include: move http_htx.h to haproxy/http_htx{,-t}.h A few includes had to be added, namely list-t.h in the type file and types/proxy.h in the proto file. actions.h was including http-htx.h but didn't need it so it was dropped.	2020-06-11 10:18:57 +02:00
Willy Tarreau	c6fe884c74	REORG: include: move h1_htx.h to haproxy/h1_htx.h This one didn't have a type file. A few missing includes were added (htx, types).	2020-06-11 10:18:57 +02:00
Willy Tarreau	0a3bd3919e	REORG: include: move compression.h to haproxy/compression{,-t}.h No change was needed.	2020-06-11 10:18:57 +02:00
Willy Tarreau	f07f30c15f	REORG: include: move proto/proto_sockpair.h to haproxy/proto_sockpair.h This one didn't have any types file and was moved as-is.	2020-06-11 10:18:57 +02:00
Willy Tarreau	832ce65914	REORG: include: move proto_udp.h to haproxy/proto_udp{,-t}.h No change was needed.	2020-06-11 10:18:57 +02:00
Willy Tarreau	14e8af5932	CLEANUP: include: remove empty raw_sock.h This one only contained an include for types/stream_interface.h, which was already present in its 3 users.	2020-06-11 10:18:57 +02:00
Willy Tarreau	551271d99c	REORG: include: move pipe.h to haproxy/pipe{,-t}.h No change was needed beyond a minor cleanup.	2020-06-11 10:18:57 +02:00
Willy Tarreau	ba2f73d40e	REORG: include: move sink.h to haproxy/sink{,-t}.h The sink files could be moved with almost no change since they didn't rely on anything fancy. ssize_t required sys/types.h and thread.h was needed for the locks.	2020-06-11 10:18:57 +02:00
Willy Tarreau	d2ad57c352	REORG: include: move ring to haproxy/ring{,-t}.h Some includes were wrong in the type definition but beyond this no change was needed.	2020-06-11 10:18:57 +02:00
Willy Tarreau	0f6ffd652e	REORG: include: move fd.h to haproxy/fd{,-t}.h A few includes were missing in each file. A definition of struct polled_mask was moved to fd-t.h. The MAX_POLLERS macro was moved to defaults.h. Stdio used to be silently inherited from whatever path but it's needed for list_pollers() which takes a FILE* and which can thus not be forward-declared.	2020-06-11 10:18:57 +02:00
Willy Tarreau	fc8f6a8517	REORG: include: move port_range.h to haproxy/port_range{,-t}.h The port ranges didn't depend on anything. However they were missing some includes such as stdlib and api-t.h which were added.	2020-06-11 10:18:57 +02:00
Willy Tarreau	334099c324	REORG: include: move shctx to haproxy/shctx{,-t}.h Minor cleanups were applied, some includes were missing from the types file and some were incorrect in a few C files (duplicated or not using path).	2020-06-11 10:18:57 +02:00
Willy Tarreau	3afc4c4bb0	REORG: include: move dict.h to haproxy/dict{,-t}.h This was entirely free-standing. haproxy/api-t.h was added for size_t.	2020-06-11 10:18:57 +02:00
Willy Tarreau	48fbcae07c	REORG: tools: split common/standard.h into haproxy/tools{,-t}.h And also rename standard.c to tools.c. The original split between tools.h and standard.h dates from version 1.3-dev and was mostly an accident. This patch moves the files back to what they were expected to be, and takes care of not changing anything else. However this time tools.h was split between functions and types, because it contains a small number of commonly used macros and structures (e.g. name_desc) which in turn cause the massive list of includes of tools.h to conflict with the callers. They remain the ugliest files of the whole project and definitely need to be cleaned and split apart. A few types are defined there only for functions provided there, and some parts are even OS-specific and should move somewhere else, such as the symbol resolution code.	2020-06-11 10:18:57 +02:00
Willy Tarreau	2dd7c35052	REORG: include: move protocol.h to haproxy/protocol{,-t}.h The protocol.h files are pretty low in the dependency and (sadly) used by some files from common/. Almost nothing was changed except lifting a few comments.	2020-06-11 10:18:57 +02:00
Willy Tarreau	fa2ef5b5eb	REORG: include: move common/fcgi.h to haproxy/ The file was moved almost verbatim (only stdio.h was dropped as useless). It was not split between types and functions because it's only included from direct C code (fcgi.c and mux_fcgi.c) as well as fcgi_app.h, included from the same ones, which should also be remerged as a single one.	2020-06-11 10:18:57 +02:00
Willy Tarreau	bf0731491b	REORG: include: move common/h2.h to haproxy/h2.h No change was performed, the file is only included from C files and currently doesn't need to be split into types+functions.	2020-06-11 10:18:57 +02:00
Willy Tarreau	be327fa332	REORG: include: move hpack*.h to haproxy/ and split hpack-tbl The various hpack files are self-contained, but hpack-tbl was one of those showing difficulties when pools were added because that began to add quite some dependencies. Now when built in standalone mode, it still uses the bare minimum pool definitions and doesn't require to know the prototypes anymore when only the structures are needed. Thus the files were moved verbatim except for hpack-tbl which was split between types and prototypes.	2020-06-11 10:18:57 +02:00
Willy Tarreau	16f958c0e9	REORG: include: split common/htx.h into haproxy/htx{,-t}.h Most of the file was a large set of HTX elements manipulation functions and few types, so splitting them allowed to further reduce dependencies and shrink the build time. Doing so revealed that a few files (h2.c, mux_pt.c) needed haproxy/buf.h and were previously getting it through htx.h. They were fixed.	2020-06-11 10:18:57 +02:00
Willy Tarreau	5413a87ad3	REORG: include: move common/h1.h to haproxy/h1.h The file was moved as-is. There was a wrong dependency on dynbuf.h instead of buf.h which was addressed. There was no benefit to splitting this between types and functions.	2020-06-11 10:18:57 +02:00
Willy Tarreau	0017be0143	REORG: include: split common/http-hdr.h into haproxy/http-hdr{,-t}.h There's only one struct and 2 inline functions. It could have been merged into http.h but that would have added a massive dependency on the hpack parts for nothing, so better keep it this way since hpack is already freestanding and portable.	2020-06-11 10:18:57 +02:00
Willy Tarreau	cd72d8c981	REORG: include: split common/http.h into haproxy/http{,-t}.h So the enums and structs were placed into http-t.h and the functions into http.h. This revealed that several files were depending on http.h but not including it, as it was silently inherited via other files.	2020-06-11 10:18:57 +02:00
Willy Tarreau	c2f7c5895c	REORG: include: move common/ticks.h to haproxy/ticks.h Nothing needed to be changed, there are no exported types.	2020-06-11 10:18:57 +02:00
Willy Tarreau	374b442cbc	REORG: include: split common/xref.h into haproxy/xref{,-t}.h The type is the only element needed by applet.h and hlua.h, while hlua.c needs the various functions. XREF_BUSY was placed into the types as well since it's better to have the special values there.	2020-06-11 10:18:57 +02:00
Willy Tarreau	7cd8b6e3a4	REORG: include: split common/regex.h into haproxy/regex{,-t}.h Regex are essentially included for myregex_t but it turns out that several of the C files didn't include it directly, relying on the one included by their own .h. This has been cleanly addressed so that only the type is included by H files which need it, and adding the missing includes for the other ones.	2020-06-11 10:18:57 +02:00
Willy Tarreau	7a00efbe43	REORG: include: move common/namespace.h to haproxy/namespace{,-t}.h The type was moved out as it's used by standard.h for netns_entry. Instead of just being a forward declaration when not used, it's an empty struct, which makes gdb happier (the resulting stripped executable is the same).	2020-06-11 10:18:57 +02:00
Willy Tarreau	6131d6a731	REORG: include: move common/net_helper.h to haproxy/net_helper.h No change was necessary.	2020-06-11 10:18:57 +02:00
Willy Tarreau	2741c8c4aa	REORG: include: move common/buffer.h to haproxy/dynbuf{,-t}.h The pretty confusing "buffer.h" was in fact not the place to look for the definition of "struct buffer" but the one responsible for dynamic buffer allocation. As such it defines the struct buffer_wait and the few functions to allocate a buffer or wait for one. This patch moves it renaming it to dynbuf.h. The type definition was moved to its own file since it's included in a number of other structs. Doing this cleanup revealed that a significant number of files used to rely on this one to inherit struct buffer through it but didn't need anything from this file at all.	2020-06-11 10:18:57 +02:00
Willy Tarreau	a04ded58dc	REORG: include: move activity to haproxy/ This moves types/activity.h to haproxy/activity-t.h and proto/activity.h to haproxy/activity.h. The macros defining the bit field values for the profiling variable were moved to the type file to be more future-proof.	2020-06-11 10:18:57 +02:00
Willy Tarreau	c13ed53b12	REORG: include: move common/chunk.h to haproxy/chunk.h No change was necessary, it was already properly split.	2020-06-11 10:18:57 +02:00
Willy Tarreau	d0ef439699	REORG: include: move common/memory.h to haproxy/pool.h Now the file is ready to be stored into its final destination. A few minor reorderings were performed to keep the file properly organized, making the various sections more visible (cache & lockless). In addition and to stay consistent, memory.c was renamed to pool.c.	2020-06-11 10:18:57 +02:00
Willy Tarreau	ed891fda52	MEDIUM: memory: make local pools independent of lockless pools Till now the local pool caches were implemented only when lockless pools were in use. This was mainly due to the difficulties to disentangle the code parts. However the locked pools would further benefit from the local cache, and having this would reduce the variants in the code. This patch does just this. It adds a new debug macro DEBUG_NO_LOCAL_POOLS to forcefully disable local pool caches, and makes sure that the high level functions are now strictly the same between locked and lockless (pool_alloc(), pool_alloc_dirty(), pool_free(), pool_get_first()). The pool index calculation was moved inside the CONFIG_HAP_LOCAL_POOLS guards. This allowed to move them out of the giant #ifdef and to significantly reduce the code duplication. A quick perf test shows that with locked pools the performance increases by roughly 10% on 8 threads and gets closer to the lockless one.	2020-06-11 10:18:57 +02:00
Willy Tarreau	f8c1b648c0	MINOR: memory: move pool-specific path of the locked pool_free() to __pool_free() pool_free() was not identical between locked and lockless pools. The difference was the call to __pool_free() in one case versus open-coded accesses in the other, and the poisoning brought by commit `da52035a45` ("MINOR: memory: also poison the area on freeing") which unfortunately did it only for the lockless path. Let's now have __pool_free() work on the global pool also in the locked case so that the code is architected similarly.	2020-06-11 10:18:56 +02:00
Willy Tarreau	fb117e6a8e	MEDIUM: memory: don't let pool_put_to_cache() free the objects itself Just as for the allocation path, the release path was not symmetrical. It was not logical to have pool_put_to_cache() free the objects itself, it was pool_free's job. In addition, just because of a variable export issue, the insertion of the object to free back into the local cache couldn't be inlined while it was very cheap. This patch just slightly reorganizes this code path by making pool_free() decide whether or not to put the object back into the cache via pool_put_to_cache() otherwise place it back to the global pool using __pool_free(). Then pool_put_to_cache() adds the item to the local cache and only calls pool_evict_from_cache() if the cache is too big.	2020-06-11 10:18:56 +02:00
Willy Tarreau	a6982e5868	MINOR: memory: don't let __pool_get_first() pick from the cache When building with the local cache support, we have an asymmetry in the allocation path which is that __pool_get_first() picks from the cache while when no cache support is used, this one directly accesses the shared area. It looks like it was done this way only to centralize the call to __pool_get_from_cache() but this was not a good idea as it complicates splitting the code. Let's move the cache access to the upper layer so that __pool_get_first() remains agnostic to the cache support. The call tree now looks like this with the cache enabled : pool_get_first() __pool_get_from_cache() // if cache enabled __pool_get_first() pool_alloc() pool_alloc_dirty() __pool_get_from_cache() // if cache enabled __pool_get_first() __pool_refill_alloc() __pool_free() pool_free_area() pool_put_to_cache() __pool_free() __pool_put_to_cache() pool_free() pool_put_to_cache() With cache disabled, the pool_free() path still differs: pool_free() __pool_free_area() __pool_put_to_cache()	2020-06-11 10:18:56 +02:00
Willy Tarreau	24aa1eebaa	REORG: memory: move the OS-level allocator to haproxy/pool-os.h The memory.h file is particularly complex due to the combination of debugging options. This patch extracts the OS-level interface and places it into a new file: pool-os.h. Doing this also moves pool_alloc_area() and pool_free_area() out of the #ifndef CONFIG_HAP_LOCKLESS_POOLS, making them usable from __pool_refill_alloc(), pool_free(), pool_flush() and pool_gc() instead of having direct calls to malloc/free there that are hard to wrap for debugging purposes.	2020-06-11 10:18:56 +02:00
Willy Tarreau	3646777a77	REORG: memory: move the pool type definitions to haproxy/pool-t.h This is the beginning of the move and cleanup of memory.h. This first step only extracts type definitions and basic macros that are needed by the files which reference a pool. They're moved to pool-t.h (since "pool" is more obvious than "memory" when looking for pool-related stuff). 3 files which didn't need to include the whole memory.h were updated.	2020-06-11 10:18:56 +02:00
Willy Tarreau	606135ac88	CLEANUP: pool: include freq_ctr.h and remove locally duplicated functions In memory.h we had to reimplement the swrate* functions just because of a broken circular dependency around freq_ctr.h. Now that this one is solved, let's get rid of this copy and use the original ones instead.	2020-06-11 10:18:56 +02:00
Willy Tarreau	6634794992	REORG: include: move freq_ctr to haproxy/ types/freq_ctr.h was moved to haproxy/freq_ctr-t.h and proto/freq_ctr.h was moved to haproxy/freq_ctr.h. Files were updated accordingly, no other change was applied.	2020-06-11 10:18:56 +02:00
Willy Tarreau	889faf467b	CLEANUP: include: remove excessive includes of common/standard.h Some of them were simply removed as unused (possibly some leftovers from an older cleanup session), some were turned to haproxy/bitops.h and a few had to be added (hlua.c and stick-table.h need standard.h for parse_time_err; htx.h requires chunk.h but used to get it through standard.h).	2020-06-11 10:18:56 +02:00
Willy Tarreau	aea4635c38	REORG: include: move integer manipulation functions from standard.h to intops.h There are quite a number of integer manipulation functions defined in standard.h, which is one of the reasons why standard.h is included from many places and participates to the dependencies loop. Let's just have a new file, intops.h to place all these operations. These are a few bitops, 32/64 bit mul/div/rotate, integer parsing and encoding (including varints), the full avalanche hash function, and the my_htonll/my_ntohll functions. For now no new C file was created for these yet.	2020-06-11 10:18:56 +02:00
Willy Tarreau	92b4f1372e	REORG: include: move time.h from common/ to haproxy/ This one is included almost everywhere and used to rely on a few other .h that are not needed (unistd, stdlib, standard.h). It could possibly make sense to split it into multiple parts to distinguish operations performed on timers and the internal time accounting, but at this point it does not appear much important.	2020-06-11 10:18:56 +02:00
Willy Tarreau	af613e8359	CLEANUP: thread: rename __decl_hathreads() to __decl_thread() I can never figure whether it takes an "s" or not, and in the end it's better if it matches the file's naming, so let's call it "__decl_thread".	2020-06-11 10:18:56 +02:00
Willy Tarreau	3f567e4949	REORG: include: split hathreads into haproxy/thread.h and haproxy/thread-t.h This splits the hathreads.h file into types+macros and functions. Given that most users of this file used to include it only to get the definition of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h (i.e. types and macros). All the thread management was left to haproxy/thread.h. It's worth noting the drop of the trailing "s" in the name, to remove the permanent confusion that arises between this one and the system implementation (no "s") and the makefile's option (no "s"). For consistency, src/hathreads.c was also renamed thread.c. A number of files were updated to only include thread-t which is the one they really needed. Some future improvements are possible like replacing empty inlined functions with macros for the thread-less case, as building at -O0 disables inlining and causes these ones to be emitted. But this really is cosmetic.	2020-06-11 10:18:56 +02:00
Willy Tarreau	5775d0964a	CLEANUP: threads: remove a few needless includes of hathreads.h A few files were including it while not needing it (anymore). Some only required access to the atomic ops and got haproxy/atomic.h in exchange. Others didn't need it at all. A significant number of files still include it only for THREAD_LOCAL definition.	2020-06-11 10:18:56 +02:00
Willy Tarreau	9453ecd670	REORG: threads: extract atomic ops from hathreads.h The hathreads.h file has quickly become a total mess because it contains thread definitions, atomic operations and locking operations, all this for multiple combinations of threads, debugging and architectures, and all this done with random ordering! This first patch extracts all the atomic ops code from hathreads.h to move it to haproxy/atomic.h. The code there still contains several sections based on non-thread vs thread, and GCC versions in the latter case. Each section was arranged in the exact same order to ease finding. The redundant HA_BARRIER() which was the same as __ha_compiler_barrier() was dropped in favor of the latter which follows the naming convention of all other barriers. It was only used in freq_ctr.c which was updated. Additionally, __ha_compiler_barrier() was defined unconditionally but used only for thread-related operations, so it was made thread-only like HA_BARRIER() used to be. We'd still need to have two types of compiler barriers, one for the general case (e.g. signals) and another one for concurrency, but this was not addressed here. Some comments were added at the beginning of each section to inform about the use case and warn about the traps to avoid. Some files which continue to include hathreads.h solely for atomic ops should now be updated.	2020-06-11 10:18:56 +02:00
Willy Tarreau	853b297c9b	REORG: include: split mini-clist into haproxy/list and list-t.h Half of the users of this include only need the type definitions and not the manipulation macros nor the inline functions. Moving the various types into mini-clist-t.h makes the files cleaner. The other one had all its includes grouped at the top. A few files continued to reference it without using it and were cleaned. In addition it was about time that we'd rename that file, it's not "mini" anymore and contains a bit more than just circular lists.	2020-06-11 10:18:56 +02:00
Willy Tarreau	f0f1c80daf	REORG: include: move istbuf.h to haproxy/ This one now relies on two files that were already cleaned up and is only used by buffer.h.	2020-06-11 10:18:56 +02:00
Willy Tarreau	8dabda7497	REORG: include: split buf.h into haproxy/buf-t.h and haproxy/buf.h File buf.h is one common cause of pain in the dependencies. Many files in the code need it to get the struct buffer definition, and a few also need the inlined functions to manipulate a buffer, but the file used to depend on a long chain only for BUG_ON() (addressed by last commit). Now buf.h is split into buf-t.h which only contains the type definitions, and buf.h for all inlined functions. Callers who don't care can continue to use buf.h but files in types/ must only use buf-t.h. sys/types.h had to be added to buf.h to get ssize_t as used by b_move(). It's worth noting that ssize_t is only supposed to be a size_t supporting -1, so b_move() ought to be rethought regarding this. The files were moved to haproxy/ and all their users were updated accordingly. A dependency issue was addressed on fcgi whose C file didn't include buf.h.	2020-06-11 10:18:56 +02:00
Willy Tarreau	025beea507	CLEANUP: debug: drop unused function p_malloc() This one was introduced 5 years ago for debugging and never really used. It is the one which used to cause circular dependencies issues. Let's drop it instead of starting to split the debug include in two.	2020-06-11 10:18:56 +02:00
Willy Tarreau	2a83d60662	REORG: include: move debug.h from common/ to haproxy/ The debug file is cleaner now and does not depend on much anymore.	2020-06-11 10:18:56 +02:00
Willy Tarreau	58017eef3f	REORG: include: move the BUG_ON() code to haproxy/bug.h This one used to be stored into debug.h but the debug tools got larger and require a lot of other includes, which can't use BUG_ON() anymore because of this. It does not make sense and instead this macro should be placed into the lower includes and given its omnipresence, the best solution is to create a new bug.h with the few surrounding macros needed to trigger bugs and place assertions anywhere. Another benefit is that it won't be required to add include <debug.h> anymore to use BUG_ON, it will automatically be covered by api.h. No less than 32 occurrences were dropped. The FSM_PRINTF macro was dropped since not used at all anymore (probably since 1.6 or so).	2020-06-11 10:18:56 +02:00
Willy Tarreau	eb6f701b99	REORG: include: move ist.h from common/ to import/ Fortunately that file wasn't made dependent upon haproxy since it was integrated, better isolate it before it's too late. Its dependency on api.h was the result of the change from config.h, which in turn wasn't correct. It was changed back to stddef.h for size_t and sys/types.h for ssize_t. The recently added reference to MAX() was changed as it was placed only to avoid a zero length in the non-free-standing version and was causing a build warning in the hpack encoder.	2020-06-11 10:18:56 +02:00
Willy Tarreau	6019faba50	REORG: include: move openssl-compat.h from common/ to haproxy/ This file is to openssl what compat.h is to the libc, so it makes sense to move it to haproxy/. It could almost be part of api.h but given the amount of openssl stuff that gets loaded I fear it could increase the build time. Note that this file contains lots of inlined functions. But since it does not depend on anything else in haproxy, it remains safe to keep all that together.	2020-06-11 10:18:56 +02:00
Willy Tarreau	8d36697dee	REORG: include: move base64.h, errors.h and hash.h from common to haproxy/ These ones do not depend on any other file. One used to include haproxy/api.h but that was solely for stddef.h.	2020-06-11 10:18:56 +02:00
Willy Tarreau	d678805783	REORG: include: move version.h to haproxy/ Few files were affected. The release scripts was updated.	2020-06-11 10:18:56 +02:00
Willy Tarreau	fd4bffe7c0	REORG: include: move the base files from common/ to haproxy/ The files currently covered by api-t.h and api.h (compat, compiler, defaults, initcall) are now located inside haproxy/.	2020-06-11 10:18:56 +02:00
Willy Tarreau	b9082a93e5	CLEANUP: include: remove unused common/tools.h Let's definitely get rid of this old file.	2020-06-11 10:18:56 +02:00
Willy Tarreau	4d653a6285	REORG: include: move SWAP/MID_RANGE/MAX_RANGE from tools.h to standard.h Tools.h doesn't make sense for these 3 macros alone anymore, let's move them to standard.h which will ultimately become again tools.h once moved.	2020-06-11 10:18:56 +02:00
Willy Tarreau	5ae5006dde	REORG: include: move MIN/MAX from tools.h to compat.h Given that these macros are usually provided by sys/param.h, better move them to compat.h.	2020-06-11 10:18:56 +02:00
Willy Tarreau	57bb71e83a	CLEANUP: include: remove unused template.h There is one "template.h" per include subdirectory to show how to create a new file but in practice nobody knows they're here so they're useless. Let's simply remove them.	2020-06-11 10:18:56 +02:00
Willy Tarreau	86556a5377	CLEANUP: include: remove common/config.h It was already an indirection to load other files, it's not used anymore.	2020-06-11 10:18:56 +02:00
Willy Tarreau	4c7e4b7738	REORG: include: update all files to use haproxy/api.h or api-t.h if needed All files that were including one of the following include files have been updated to only include haproxy/api.h or haproxy/api-t.h once instead: - common/config.h - common/compat.h - common/compiler.h - common/defaults.h - common/initcall.h - common/tools.h The choice is simple: if the file only requires type definitions, it includes api-t.h, otherwise it includes the full api.h. In addition, in these files, explicit includes for inttypes.h and limits.h were dropped since these are now covered by api.h and api-t.h. No other change was performed, given that this patch is large and affects 201 files. At least one (tools.h) was already freestanding and didn't get the new one added.	2020-06-11 10:18:42 +02:00
Willy Tarreau	7ab7031e34	REORG: include: create new file haproxy/api.h This file includes everything that must be guaranteed to be available to any buildable file in the project (including the contrib/ subdirs). For now it includes <haproxy/api-t.h> so that standard integer types and compiler macros are known, <common/initcall.h> to ease dynamic registration of init functions, and <common/tools.h> for a few MIN/MAX macros. version.h should probably also be added, though at the moment it doesn't bring a great value. All files which currently include the ones above should now switch to haproxy/api.h or haproxy/api-t.h instead. This should also reduce build time by having a single guard for several files at once.	2020-06-11 09:31:11 +02:00
Willy Tarreau	ca1765713b	REORG: include: create new file haproxy/api-t.h This file is at the lowest level of the include tree. Its purpose is to make sure that common types are known pretty much everywhere, particularly in structure declarations. It will essentially cover integer types such as uintXX_t via inttypes.h, "size_t" and "ptrdiff_t" via stddef.h, and various type modifiers such as __maybe_unused or ALIGN() via compiler.h, compat.h and defaults.h. It could be enhanced later if required, for example if some macros used to compute array sizes are needed.	2020-06-11 09:31:11 +02:00
Willy Tarreau	8d2b777fe3	REORG: ebtree: move the include files from ebtree to include/import/ This is where other imported components are located. All files which used to directly include ebtree were touched to update their include path so that "import/" is now prefixed before the ebtree-related files. The ebtree.h file was slightly adjusted to read compiler.h from the common/ subdirectory (this is the only change). A build issue was encountered when eb32sctree.h is loaded before eb32tree.h because only the former checks for the latter before defining type u32. This was addressed by adding the reverse ifdef in eb32tree.h. No further cleanup was done yet in order to keep changes minimal.	2020-06-11 09:31:11 +02:00
Christopher Faulet	89aed32bff	MINOR: mux-h1/proxy: Add a proxy option to disable clear h2 upgrade By default, HAProxy is able to implicitly upgrade an H1 client connection to an H2 connection if the first request it receives from a given HTTP connection matches the HTTP/2 connection preface. This way, it is possible to support H1 and H2 clients on non-SSL connections. It could be a problem if for any reason, the H2 upgrade is not acceptable. "option disable-h2-upgrade" may now be used to disable it, per proxy. The main purpose of this option is to let an admin totally disable the H2 support for security reasons. Recently, a critical issue in the HPACK decoder was fixed, forcing everyone to upgrade their HAProxy version to fix the bug. It is possible to disable H2 for SSL connections, but not on clear ones. This option would have been a viable workaround.	2020-06-03 10:23:39 +02:00
Willy Tarreau	39bd740d00	CLEANUP: regex: remove outdated support for regex actions The support for reqrep and friends was removed in 2.1 but the chain_regex() function and the "action" field in the regex struct was still there. This patch removes them. One point worth mentioning though. There is a check_replace_string() function whose purpose was to validate the replacement strings passed to reqrep. It should also be used for other replacement regex, but is never called. Callers of exp_replace() should be checked and a call to this function should be added to detect the error early.	2020-06-02 17:17:13 +02:00
Emeric Brun	975564784f	MEDIUM: ring: add new srv statement to support octet counting forward log-proto <logproto> The "log-proto" specifies the protocol used to forward event messages to a server configured in a ring section. Possible values are "legacy" and "octet-count" corresponding respectively to "Non-transparent-framing" and "Octet counting" in rfc6587. "legacy" is the default. Notes: a separated io_handler was created to avoid per messages test and to prepare code to set different log protocols such as request- response based ones.	2020-05-31 10:49:43 +02:00
Emeric Brun	494c505703	MEDIUM: ring: add server statement to forward messages from a ring This patch adds new statement "server" into ring section, and the related "timeout connect" and "timeout server". server <name> <address> [param*] Used to configure a syslog tcp server to forward messages from ring buffer. This supports all "server" parameters found in 5.2 paragraph. Some of these parameters are irrelevant for "ring" sections. timeout connect <timeout> Set the maximum time to wait for a connection attempt to a server to succeed. Arguments : <timeout> is the timeout value specified in milliseconds by default, but can be in any other unit if the number is suffixed by the unit, as explained at the top of this document. timeout server <timeout> Set the maximum time for pending data staying into output buffer. Arguments : <timeout> is the timeout value specified in milliseconds by default, but can be in any other unit if the number is suffixed by the unit, as explained at the top of this document. Example: global log ring@myring local7 ring myring description "My local buffer" format rfc3164 maxlen 1200 size 32764 timeout connect 5s timeout server 10s server mysyslogsrv 127.0.0.1:6514	2020-05-31 10:46:13 +02:00
Emeric Brun	dcd58afaf1	MINOR: ring: re-work ring attach generic API. Attach is now independent on appctx, which was unused anyway.	2020-05-31 10:37:31 +02:00
Willy Tarreau	21072b9480	CLEANUP: pools: use the regular lock for the flush operation on lockless pools Commit `04f5fe87d3` introduced an rwlock in the pools to deal with the risk that pool_flush() dereferences an area being freed, and commit `899fb8abdc` turned it into a spinlock. The pools already contain a spinlock in case of locked pools, so let's use the same and simplify the code by removing ifdefs. At this point I'm really suspecting that if pool_flush() would instead rely on __pool_get_first() to pick entries from the pool, the concurrency problem could never happen since only one user would get a given entry at once, thus it could not be freed by another user. It's not certain this would be faster however because of the number of atomic ops to retrieve one entry compared to a locked batch.	2020-05-29 17:28:04 +02:00
Christopher Faulet	0bac4cdf1a	CLEANUP: http: Remove unused HTTP message templates HTTP_1XX, HTTP_3XX and HTTP_4XX message templates are no longer used. Only HTTP_302 and HTTP_303 are used during configuration parsing by "errorloc" family directives. So these templates are removed from the generic http code. And HTTP_302 and HTTP_303 templates are moved as static strings in the function parsing "errorloc" directives.	2020-05-28 15:07:20 +02:00
Christopher Faulet	b304883754	MINOR: http-rules: Use an action function to eval http-request auth rules Now http-request auth rules are evaluated in a dedicated function and no longer handled "in place" during the HTTP rules evaluation. Thus the action name ACT_HTTP_REQ_AUTH is removed. In addition, http_reply_40x_unauthorized() is also removed. This part is now handled in the new action_ptr callback function.	2020-05-28 15:07:20 +02:00
Christopher Faulet	612f2eafe9	MINOR: http-ana: Use proxy's error replies to emit 401/407 responses There is no reason to not use proxy's error replies to emit 401/407 responses. The function http_reply_40x_unauthorized(), responsible to emit those responses, is not really complex. It only adds a WWW-Authenticate/Proxy-Authenticate header to a generic message. So now, error replies can be defined for 401 and 407 status codes, using errorfile or http-error directives. When an http-request auth rule is evaluated, the corresponding error reply is used. For 401 responses, all occurrences of the WWW-Authenticate header are removed and replaced by a new one with a basic authentication challenge for the configured realm. For 407 responses, the same is done on the Proxy-Authenticate header. If the error reply must not be altered, "http-request return" rule must be used instead.	2020-05-28 15:07:20 +02:00
Christopher Faulet	ae43b6c446	MINOR: http-ana: Make the function http_reply_to_htx() public This function may be used from anywhere to convert an HTTP reply to an HTX message.	2020-05-28 15:07:20 +02:00
Willy Tarreau	63a8738724	MEDIUM: pools: directly free objects when pools are too much crowded During pool_free(), when the ->allocated value is 125% of needed_avg or more, instead of putting the object back into the pool, it's immediately freed using free(). By doing this we manage to significantly reduce the amount of memory pinned in pools after transient traffic spikes. During a test involving a constant load of 100 concurrent connections each delivering 100 requests per second, the memory usage was a steady 21 MB RSS. Adding a 1 minute parallel load of 40k connections all looping on 100kB objects made the memory usage climb to 938 MB before this patch. With the patch it was only 660 MB. But when this parasitic load stopped, before the patch the RSS would remain at 938 MB while with the patch, it went down to 480 then 180 MB after a few seconds, to stabilize around 69 MB after about 20 seconds. This can be particularly important to improve reloads where the memory has to be shared between the old and new process. Another improvement would be welcome, we ought to have a periodic task to check pools usage and continue to free up unused objects regardless of any call to pool_free(), because the needed_avg value depends on the past and will not cover recently refilled objects.	2020-05-27 08:32:42 +02:00
Willy Tarreau	a1e4f8c27c	MINOR: pools: compute an estimate of each pool's average needed objects This adds a sliding estimate of the pools' usage. The goal is to be able to use this to start to more aggressively free memory instead of keeping lots of unused objects in pools. The average is calculated as a sliding average over the last 1024 consecutive measures of ->used during calls to pool_free(), and is bumped up for 1/4 of its history from ->allocated when allocation from the pool fails and results in a call to malloc(). The result is a floating value between ->used and ->allocated, that tries to react fast to under-estimates that result in expensive malloc() but still maintains itself well in case of stable usage, and progressively goes down if usage shrinks over time. This new metric is reported as "needed_avg" in "show pools". Sadly due to yet another include dependency hell, we couldn't reuse the functions from freq_ctr.h so they were temporarily duplicated into memory.h.	2020-05-27 08:32:42 +02:00
Emeric Brun	99c453df9d	MEDIUM: ring: new section ring to declare custom ring buffers. It is possible to globally declare ring-buffers, to be used as target for log servers or traces. ring <ringname> Creates a new ring-buffer with name <ringname>. description <text> The description is an optional description string of the ring. It will appear on CLI. By default, <name> is reused to fill this field. format <format> Format used to store events into the ring buffer. Arguments: <format> is the log format used when generating syslog messages. It may be one of the following : iso A message containing only the ISO date, followed by the text. The PID, process name and system name are omitted. This is designed to be used with a local log server. raw A message containing only the text. The level, PID, date, time, process name and system name are omitted. This is designed to be used in containers or during development, where the severity only depends on the file descriptor used (stdout/stderr). This is the default. rfc3164 The RFC3164 syslog message format. This is the default. (https://tools.ietf.org/html/rfc3164) rfc5424 The RFC5424 syslog message format. (https://tools.ietf.org/html/rfc5424) short A message containing only a level between angle brackets such as '<3>', followed by the text. The PID, date, time, process name and system name are omitted. This is designed to be used with a local log server. This format is compatible with what the systemd logger consumes. timed A message containing only a level between angle brackets such as '<3>', followed by ISO date and by the text. The PID, process name and system name are omitted. This is designed to be used with a local log server. maxlen <length> The maximum length of an event message stored into the ring, including formatted header. If an event message is longer than <length>, it will be truncated to this length. size <size> This is the optional size in bytes for the ring-buffer. Default value is set to BUFSIZE. Example: global log ring@myring local7 ring myring description "My local buffer" format rfc3164 maxlen 1200 Note: ring names are resolved during post configuration processing.	2020-05-26 08:03:15 +02:00
Tim Duesterhus	b4fac1eb3c	MINOR: vars: Make vars_(un\|)set_by_name(_ifexist\|) return a success value Change the return type from `void` to `int` and return whether setting the variable was successful.	2020-05-25 08:12:27 +02:00
Tim Duesterhus	7329327333	CLEANUP: vars: Remove void vars_unset_by_name(const char, size_t, struct sample) With "MINOR: lua: Use vars_unset_by_name_ifexist()" the last user was removed and as outlined in that commit there is no good reason for this function to exist. May be backported together with the commit mentioned above.	2020-05-25 08:12:23 +02:00
Willy Tarreau	0ff9b3d64f	BUILD: hpack: make sure the hpack table can still be built standalone Recent commit `2bdcc70fa7` ("MEDIUM: hpack: use a pool for the hpack table") made the hpack code finally use a pool with very unintrusive code that was assumed to be trivial enough to adjust if the code needed to be reused outside of haproxy. Unfortunately the code in contrib/hpack already uses it and broke the oss-fuzz tests as it doesn't build anymore. This patch adds an HPACK_STANDALONE macro to decide if we should use the pools or malloc+free. The resulting macros are called hpack_alloc() and hpack_free() respectively, and the size must be passed into the pool itself.	2020-05-22 12:13:43 +02:00
Christopher Faulet	3b967c1210	MINOR: http-htx/proxy: Add http-error directive using http return syntax The http-error directive can now be used instead of errorfile to define an error message in a proxy section (including default sections). This directive uses the same syntax as http return rules. The only real difference is the limitation on status code that may be specified. Only status codes supported by errorfile directives are supported for this new directive. Parsing of errorfile directive remains independent from http-error parsing. But functionally, it may be expressed in terms of http-errors : errorfile <status> <file> ==> http-error status <status> errorfile <file>	2020-05-20 18:27:14 +02:00
Christopher Faulet	963ce5bc06	CLEANUP: channel: Remove channel_htx_copy_msg() function This function is now unused. So it is removed.	2020-05-20 18:27:14 +02:00
Christopher Faulet	2056736453	MINOR: htx: Add a function to copy a buffer in an HTX message The htx_copy_msg() function can now be used to copy the HTX message stored in a buffer in an existing HTX message. It takes care to not overwrite existing data. If the destination message is empty, a raw copy is performed. All the message is copied or nothing. This function is used instead of channel_htx_copy_msg().	2020-05-20 18:27:14 +02:00
Christopher Faulet	f1fedc3cce	CLEANUP: http-htx: Remove unused storage of error messages in buffers Now, error messages are all stored in http replies. So the storage as a buffer can safely be removed.	2020-05-20 18:27:14 +02:00
Christopher Faulet	8dfeccf6d3	MEDIUM: http-ana: Use http replies for HTTP error messages When HAProxy returns an http error message, the corresponding http reply is now used instead of the buffer containing the corresponding HTX message. So, http_error_message() function now returns the http reply to use for a given stream. And the http_reply_and_close() function now relies on http_reply_message() to send the response to the client.	2020-05-20 18:27:14 +02:00
Christopher Faulet	507479b096	MINOR: http-ana: Use a TXN flag to prevent after-response ruleset evaluation The txn flag TX_CONST_REPLY may now be used to prevent after-response ruleset evaluation. It is used if this ruleset evaluation failed on an internal error response. Before, it was done incrementing the parameter <final>. But it is not really convenient if an intermediary function is used to produce the response. Using a txn flag could also be a good way to prevent after-response ruleset evaluation in a different context.	2020-05-20 18:27:13 +02:00
Christopher Faulet	e29a97e51a	MINOR: http-htx: Use http reply from the http-errors section When an http reply is configured to use an error message from an http-errors section, instead of referencing the error message, the http reply is used. To do so the new http reply type HTTP_REPLY_INDIRECT has been added.	2020-05-20 18:27:13 +02:00
Christopher Faulet	40e8569676	MINOR: proxy: Add references on http replies for proxy error messages Error messages defined in proxy section or inherited from a default section are now also referenced using an array of http replies. This is done during the configuration validity check.	2020-05-20 18:27:13 +02:00
Christopher Faulet	5809e10b48	MINOR: http-htx: Store errorloc/errorfile messages in http replies During configuration parsing, error messages resulting from the parsing of errorloc and errorfile directives are now also stored as an http reply. So, for now, these messages are stored as a buffer and as an http reply. To be able to release all these http replies when haproxy is stopped, a global list is used. We must do that because the same http reply may be referenced several times by different proxies if it is defined in a default section.	2020-05-20 18:27:13 +02:00
Christopher Faulet	de30bb7245	MINOR: http-htx: Store messages of an http-errors section in a http reply array Error messages specified in an http-errors section are now also stored in an array of http replies. So, for now, these messages are stored as a buffer and as an http reply.	2020-05-20 18:27:13 +02:00
Christopher Faulet	1b13ecaca2	MINOR: http-htx: Store default error messages in a global http reply array Default error messages are stored as a buffer, in http_err_chunks global array. Now, they are also stored as a http reply, in http_err_replies global array.	2020-05-20 18:27:13 +02:00
Christopher Faulet	5cb513abeb	MEDIUM: http-rules: Rely on http reply for http deny/tarpit rules "http-request deny", "http-request tarpit" and "http-response deny" rules now use the same syntax as http return rules and internally rely on the http replies. The behaviour is not the same when no argument is specified (or only the status code). For http replies, a dummy response is produced, with no payload. For old deny/tarpit rules, the proxy's error messages are used. Thus, to be compatible with existing configuration, the "default-errorfiles" parameter is implied. For instance: http-request deny deny_status 404 is now an alias of http-request deny status 404 default-errorfiles	2020-05-20 18:27:13 +02:00
Christopher Faulet	0e2ad61315	MINOR: http-ana: Use a dedicated function to send a response from an http reply The http_reply_message() function may be used to send an http reply to a client. This function is responsible for converting the reply to HTX, pushing it into the response buffer and forwarding it to the client. It is also responsible for terminating the transaction. This function is used during evaluation of http return rules.	2020-05-20 18:27:13 +02:00
Christopher Faulet	7eea241c39	MINOR: http-htx: Use a dedicated function to check http reply validity A dedicated function is added to check the validity of an http reply object, after parsing. It is used to check the validity of http return rules. For now, this function is only used to find the right error message in an http-errors section for http replies of type HTTP_REPLY_ERRFILES (using "errorfiles" argument). On success, such replies are updated to point on the corresponding error message and their type is set to HTTP_REPLY_ERRMSG. If an unknown http-errors section is referenced, an error is returned. If an unknown error message is referenced inside an existing http-errors section, a warning is emitted and the proxy's error messages are used instead.	2020-05-20 18:27:13 +02:00
Christopher Faulet	47e791e220	MINOR: http-htx: Use a dedicated function to parse http reply arguments A dedicated function to parse arguments and create an http_reply object is added. It is used to parse http return rule. Thus, following arguments are parsed by this function : ... [status <code>] [content-type <type>] [ { default-errorfiles | errorfile <file> | errorfiles <name> | file <file> | lf-file <file> | string <str> | lf-string <fmt> } ] [ hdr <name> <fmt> ]* Because the status code argument is optional, a default status code must be defined when this function is called.	2020-05-20 18:27:13 +02:00
Christopher Faulet	18630643a9	MINOR: http-htx: Use a dedicated function to release http_reply objects A function to release an http_reply object has been added. It is now called when an http return rule is released.	2020-05-20 18:27:13 +02:00
Christopher Faulet	5ff0c64921	MINOR: http-rules: Use http_reply structure for http return rules No real change here. Instead of using an internal structure to the action rule, the http return rules are now stored as an http reply. The main change is about the action type. It is now always set to ACT_CUSTOM. The http reply type is used to know how to evaluate the rule.	2020-05-20 18:27:13 +02:00
Christopher Faulet	b6ea17c6fc	CLEANUP: http-htx: Rename http_error structure into http_error_msg The structure owns an error message, most of the time loaded from a file, and converted to HTX. It is created when an errorfile or errorloc directive is parsed. It is renamed to avoid ambiguities with http_reply structure.	2020-05-20 18:27:13 +02:00
Christopher Faulet	7bd3de06e7	MINOR: http-htx: Add http_reply type based on what is used for http return rules The http_reply structure is added. It represents a generic HTTP message used as internal response by HAProxy. It is based on the structure used to store http return rules. The aim is to store all error messages using this structure, as well as http return and http deny rules.	2020-05-20 18:27:13 +02:00
Christopher Faulet	a53abad42d	CLEANUP: http_ana: Remove unused TXN flags TX_CLDENY, TX_CLALLOW, TX_SVDENY and TX_SVALLOW flags are unused. Only TX_CLTARPIT is used to make the difference between an http deny rule and an http tarpit rule. So these unused flags are removed.	2020-05-20 18:27:13 +02:00
William Lallemand	8177ad9895	MINOR: ssl: split config and runtime variable for ssl-{min,max}-ver In the CLI command 'show ssl crt-list', the ssl-min-ver and the ssl-max-ver arguments were always displayed because the dumped versions were the actual version computed and used by haproxy, instead of the version found in the configuration. To fix the problem, this patch separates the variables to have one with the configured version, and one with the actual version used. The dump only shows the configured version.	2020-05-20 16:49:02 +02:00
Willy Tarreau	d68a6927f7	Revert "MEDIUM: sink: add global statement to create a new ring (sink buffer)" This reverts commit `957ec59571`. As discussed with Emeric, the current syntax is not extensible enough, this will be turned to a section instead in a forthcoming patch.	2020-05-20 12:06:16 +02:00
Willy Tarreau	928068a74b	MINOR: ring: make the applet code not depend on the CLI The ring to applet communication was only made to deal with CLI functions but it's generic. Let's have generic appctx functions and have the CLI rely on these instead. This patch introduces ring_attach_appctx() and ring_detach_appctx().	2020-05-19 19:37:12 +02:00
Willy Tarreau	9597cbd17a	MINOR: applet: adopt the wait list entry from the CLI A few fields, including a generic list entry, were added to the CLI context by commit `300decc8d9` ("MINOR: cli: extend the CLI context with a list and two offsets"). It turns out that the list entry (l0) is solely used to consult rings and that the generic ring_write() code is restricted to a consumer on the CLI due to this, which was not the initial intent. Let's make it a general purpose wait_entry field that is properly initialized during appctx_init(). This will allow any applet to wait on a ring, not just the CLI.	2020-05-19 19:37:12 +02:00
Willy Tarreau	2bdcc70fa7	MEDIUM: hpack: use a pool for the hpack table Instead of using malloc/free to allocate an HPACK table, let's declare a pool. However the HPACK size is configured by the H2 mux, so it's also this one which allocates it after post_check.	2020-05-19 11:40:39 +02:00
Emeric Brun	957ec59571	MEDIUM: sink: add global statement to create a new ring (sink buffer) This patch adds the new global statement: ring <name> [desc <desc>] [format <format>] [size <size>] [maxlen <length>] Creates a named ring buffer which could be used on log line for instance. <desc> is an optional description string of the ring. It will appear on CLI. By default, <name> is reused to fill this field. <format> is the log format used when generating syslog messages. It may be one of the following : iso A message containing only the ISO date, followed by the text. The PID, process name and system name are omitted. This is designed to be used with a local log server. raw A message containing only the text. The level, PID, date, time, process name and system name are omitted. This is designed to be used in containers or during development, where the severity only depends on the file descriptor used (stdout/stderr). This is the default. rfc3164 The RFC3164 syslog message format. This is the default. (https://tools.ietf.org/html/rfc3164) rfc5424 The RFC5424 syslog message format. (https://tools.ietf.org/html/rfc5424) short A message containing only a level between angle brackets such as '<3>', followed by the text. The PID, date, time, process name and system name are omitted. This is designed to be used with a local log server. This format is compatible with what the systemd logger consumes. timed A message containing only a level between angle brackets such as '<3>', followed by ISO date and by the text. The PID, process name and system name are omitted. This is designed to be used with a local log server. <length> is the maximum length of event message stored into the ring, including formatted header. If the event message is longer than <length>, it would be truncated to this length. <name> is the ring identifier, which follows the same naming convention as proxies and servers. <size> is the optional size in bytes. Default value is set to BUFSIZE. 
Note: Historically sink's name and desc were refs on const strings. But with new configurable rings a dynamic allocation is needed.	2020-05-19 11:04:11 +02:00
Emeric Brun	e709e1e777	MEDIUM: logs: buffer targets now rely on new sink_write Before this patch, they relied directly on ring_write, bypassing a part of the sink API. Now the maxlen parameter of the log will apply only on the text message part (and not the header, for this you would prefer to use the maxlen parameter on the sink/ring). sink_write prototype was also reviewed to return the number of bytes written to be compliant with the other write functions.	2020-05-19 11:04:11 +02:00
Emeric Brun	bd163817ed	MEDIUM: sink: build header in sink_write for log formats This patch extends the sink_write prototype and code to handle the rfc5424 and rfc3164 header. It uses header building tools from log.c. Doing this some functions/vars have been externalized. facility and minlevel have been removed from the struct sink and passed to args at sink_write because they depend on the log and not on the sink (they remained unused by rest of the code until now).	2020-05-19 11:04:11 +02:00
William Dauchy	1665c43fd8	BUILD: ssl: include buffer common headers for ssl_sock_ctx since commit `c0cdaffaa3` ("REORG: ssl: move ssl_sock_ctx and fix cross-dependencies issues"), `struct ssl_sock_ctx` was moved in ssl_sock.h. As it contains a `struct buffer`, including `common/buffer.h` is now mandatory. I encountered an issue while including ssl_sock.h on another patch: include/types/ssl_sock.h:240:16: error: field ‘early_buf’ has incomplete type 240 | struct buffer early_buf; /* buffer to store the early data received */ no backport needed. Fixes: `c0cdaffaa3` ("REORG: ssl: move ssl_sock_ctx and fix cross-dependencies issues") Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-05-18 08:29:32 +02:00
Marcin Deranek	4dc2b57d51	MINOR: stats: Prepare for more accurate moving averages Add swrate_add_dynamic function which is similar to swrate_add, but more accurate when calculating moving averages when not enough samples have been processed yet.	2020-05-16 22:40:00 +02:00
William Lallemand	6a66a5ec9b	REORG: ssl: move utility functions to src/ssl_utils.c These functions are mainly used to extract information from certificates.	2020-05-15 14:11:54 +02:00
William Lallemand	15e169447d	REORG: ssl: move sample fetches to src/ssl_sample.c Move all SSL sample fetches to src/ssl_sample.c.	2020-05-15 14:11:54 +02:00
William Lallemand	c0cdaffaa3	REORG: ssl: move ssl_sock_ctx and fix cross-dependencies issues In order to move all SSL sample fetches in another file, moving the ssl_sock_ctx definition in a .h file is required. Unfortunately it became a cross dependencies hell to solve, because of the struct wait_event field, so <types/connection.h> is needed which created other problems.	2020-05-15 14:11:54 +02:00
William Lallemand	ef76107a4b	MINOR: ssl: remove static keyword in some SSL utility functions In order to move the sample fetches to another file, remove the static keyword of some utility functions in the SSL fetches.	2020-05-15 14:11:54 +02:00
William Lallemand	dad3105157	REORG: ssl: move ssl configuration to cfgparse-ssl.c Move all the configuration parsing of the ssl keywords in cfgparse-ssl.c	2020-05-15 14:11:54 +02:00
William Lallemand	da8584c1ea	REORG: ssl: move the CLI 'cert' functions to src/ssl_ckch.c Move the 'ssl cert' CLI functions to src/ssl_ckch.c.	2020-05-15 14:11:54 +02:00
William Lallemand	c756bbd3df	REORG: ssl: move the crt-list CLI functions in src/ssl_crtlist.c Move the crtlist functions for the CLI to src/ssl_crtlist.c	2020-05-15 14:11:54 +02:00
William Lallemand	03c331c80a	REORG: ssl: move the ckch_store related functions to src/ssl_ckch.c Move the cert_key_and_chain functions: int ssl_sock_load_files_into_ckch(const char *path, struct cert_key_and_chain *ckch, char **err); int ssl_sock_load_pem_into_ckch(const char *path, char *buf, struct cert_key_and_chain *ckch , char **err); void ssl_sock_free_cert_key_and_chain_contents(struct cert_key_and_chain *ckch); int ssl_sock_load_key_into_ckch(const char *path, char *buf, struct cert_key_and_chain *ckch , char **err); int ssl_sock_load_ocsp_response_from_file(const char *ocsp_path, char *buf, struct cert_key_and_chain *ckch, char **err); int ssl_sock_load_sctl_from_file(const char *sctl_path, char *buf, struct cert_key_and_chain *ckch, char **err); int ssl_sock_load_issuer_file_into_ckch(const char *path, char *buf, struct cert_key_and_chain *ckch, char **err); And the utility ckch_store functions: void ckch_store_free(struct ckch_store *store) struct ckch_store *ckch_store_new(const char *filename, int nmemb) struct ckch_store *ckchs_dup(const struct ckch_store *src) ckch_store *ckchs_lookup(char *path) ckch_store *ckchs_load_cert_file(char *path, int multi, char **err)	2020-05-15 14:11:54 +02:00
William Lallemand	c1c50b46e9	CLEANUP: ssl: avoid circular dependencies in ssl_crtlist.h Add forward declarations in types/ssl_crtlist.h in order to avoid circular dependencies. Also remove the listener.h include which is not needed anymore.	2020-05-15 14:11:54 +02:00
William Lallemand	6e9556b635	REORG: ssl: move crtlist functions to src/ssl_crtlist.c Move the crtlist functions to src/ssl_crtlist.c and their definitions to proto/ssl_crtlist.h. The following functions were moved: /* crt-list entry functions */ void ssl_sock_free_ssl_conf(struct ssl_bind_conf *conf); char **crtlist_dup_filters(char **args, int fcount); void crtlist_free_filters(char **args); void crtlist_entry_free(struct crtlist_entry *entry); struct crtlist_entry *crtlist_entry_new(); /* crt-list functions */ void crtlist_free(struct crtlist *crtlist); struct crtlist *crtlist_new(const char *filename, int unique); /* file loading */ int crtlist_parse_line(char *line, char **crt_path, struct crtlist_entry *entry, const char *file, int linenum, char **err); int crtlist_parse_file(char *file, struct bind_conf *bind_conf, struct proxy *curproxy, struct crtlist **crtlist, char **err); int crtlist_load_cert_dir(char *path, struct bind_conf *bind_conf, struct crtlist **crtlist, char **err);	2020-05-15 14:11:54 +02:00
William Lallemand	c69973f7eb	CLEANUP: ssl: add ckch prototypes in proto/ssl_ckch.h Remove the static definitions of the ckch functions and add them to ssl_ckch.h in order to use them outside ssl_sock.c.	2020-05-15 14:11:54 +02:00
William Lallemand	d4632b2b6d	REORG: ssl: move the ckch structures to types/ssl_ckch.h Move all the structures used for loading the SSL certificates in ssl_ckch.h	2020-05-15 14:11:54 +02:00
William Lallemand	be21b663cd	REORG: move the crt-list structures in their own .h Move the structure definitions specifics to the crt-list in types/ssl_crtlist.h.	2020-05-15 14:11:54 +02:00
William Lallemand	7fd8b4567e	REORG: ssl: move macros and structure definitions to ssl_sock.h The ssl_sock.c file contains a lot of macros and structure definitions that should be in a .h. Move them to the more appropriate types/ssl_sock.h file.	2020-05-15 14:11:54 +02:00
Dragan Dosen	eb607fe6a1	MINOR: ssl: add a new function ssl_sock_get_ssl_object() This one can be used later to get an SSL object from a connection. It will return NULL if the connection is not established over SSL.	2020-05-14 13:13:14 +02:00
Dragan Dosen	1e7ed04665	MEDIUM: ssl: allow to register callbacks for SSL/TLS protocol messages This patch adds the ability to register callbacks for SSL/TLS protocol messages by using the function ssl_sock_register_msg_callback(). All registered callback functions will be called when observing received or sent SSL/TLS protocol messages.	2020-05-14 13:13:14 +02:00
Christopher Faulet	325504cf89	BUG/MINOR: sample/ssl: Fix digest converter for openssl < 1.1.0 The EVP_MD_CTX_create() and EVP_MD_CTX_destroy() functions were renamed to EVP_MD_CTX_new() and EVP_MD_CTX_free() in OpenSSL 1.1.0, respectively. These functions are used by the digest converter, introduced by the commit `8e36651ed` ("MINOR: sample: Add digest and hmac converters"). So for prior versions of openssl, macros are used to fallback on old functions. This patch must only be backported if the commit `8e36651ed` is backported too.	2020-05-12 16:30:41 +02:00
Willy Tarreau	5778fea4da	CLEANUP: remove THREAD_LOCAL from config.h This one really ought to be defined in hathreads.h like all other thread definitions, which is what this patch does. As expected, all files but one (regex.h) were already including hathreads.h when using THREAD_LOCAL; regex.h was fixed for this. This was the last entry in config.h which is now useless.	2020-05-09 09:08:09 +02:00
Willy Tarreau	3bc4e8bfe6	CLEANUP: config: move CONFIG_HAP_LOCKLESS_POOLS out of config.h The setting of CONFIG_HAP_LOCKLESS_POOLS depending on threads and compat was done in config.h for use only in memory.h and memory.c where other settings are dealt with. Further, the default pool cache size was set there from a fixed value instead of being set from defaults.h. Let's move the decision to enable lockless pools via CONFIG_HAP_LOCKLESS_POOLS to memory.h, and set the default pool cache size in defaults.h like other default settings. This was the next-to-last setting in config.h.	2020-05-09 09:02:35 +02:00
Willy Tarreau	755afc08d5	CLEANUP: config: drop unused setting CONFIG_HAP_INLINE_FD_SET CONFIG_HAP_INLINE_FD_SET was introduced in 1.3.3 and dropped in 1.3.9 when the pollers were reworked, let's remove it.	2020-05-09 08:57:48 +02:00
Willy Tarreau	571eb3d659	CLEANUP: config: drop unused setting CONFIG_HAP_MEM_OPTIM CONFIG_HAP_MEM_OPTIM was introduced with memory pools in 1.3 and dropped in 1.6 when pools became the only way to allocate memory. Still the option remained present in config.h. Let's kill it.	2020-05-09 08:53:31 +02:00
Christopher Faulet	67a234583e	CLEANUP: checks: sort and rename tcpcheck_expect_type types The same naming format is used for all expect rules. And names are sorted to be grouped by type.	2020-05-06 12:38:44 +02:00
Christopher Faulet	aaab0836d9	MEDIUM: checks: Add matching on log-format string for expect rules It is now possible to use log-format string (or hexadecimal string for the binary version) to match a content in tcp-check based expect rules. For hexadecimal log-format string, the conversion in binary is performed after the string evaluation, during health check execution. The pattern keywords to use are "string-lf" for the log-format string and "binary-lf" for the hexadecimal log-format string.	2020-05-06 08:31:29 +02:00
Willy Tarreau	a4d9ee3d1c	BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_UPDATE_{MIN,MAX}() Just like in previous patch, it happens that HA_ATOMIC_UPDATE_MIN() and HA_ATOMIC_UPDATE_MAX() would evaluate the (val) argument up to 3 times. However this time it affects both thread and non-thread versions. It's strange because the copy was properly performed for the (new) argument in order to avoid this. Anyway it was done for the "val" one as well. A quick code inspection showed that this currently has no effect as these macros are fairly limited in usage. It would be best to backport this for long-term stability (till 1.8) but it will not fix an existing bug.	2020-05-05 16:18:52 +02:00
Willy Tarreau	d66345d6b0	BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_CAS() When threads are disabled, HA_ATOMIC_CAS() becomes a simple compound expression. However this expression presents a problem, which is that its arguments are evaluated multiple times, once for the comparison and once again for the assignment. This presents a risk of performing some side-effect operations twice in the non-threaded case (e.g. in case of auto-increment or function return). The macro was rewritten using local copies for arguments like the other macros do. Fortunately a complete inspection of the code indicates that this case currently never happens. It was however responsible for the strict-aliasing warning emitted when building fd.c without threads but with 64-bit CAS. This may be backported as far as 1.8 though it will not fix any existing bug and is more of a long-term safety measure in case a future fix would depend on this behavior.	2020-05-05 16:05:45 +02:00
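
The double-evaluation hazard described in the HA_ATOMIC_CAS() entry above can be illustrated with a minimal sketch. The macro names SK_NAIVE_CAS and SK_SAFE_CAS and the demo function are hypothetical and are not HAProxy's actual definitions; the sketch assumes a GCC- or Clang-compatible compiler for statement expressions and `__typeof__`.

```c
#include <assert.h>

/* Hypothetical non-threaded compare-and-swap macros, for illustration only. */

/* Naive form: (val) and (old) are each expanded more than once, so an
 * argument with a side effect such as p++ is evaluated more than once. */
#define SK_NAIVE_CAS(val, old, new) \
	((*(val) == *(old)) ? (*(val) = (new), 1) : (*(old) = *(val), 0))

/* Safer form: each argument is copied into a local exactly once. */
#define SK_SAFE_CAS(val, old, new)                                         \
	({                                                                 \
		__typeof__(val) sk_v = (val);                              \
		__typeof__(old) sk_o = (old);                              \
		__typeof__(*(val)) sk_n = (new);                           \
		(*sk_v == *sk_o) ? (*sk_v = sk_n, 1) : (*sk_o = *sk_v, 0); \
	})

/* Usage: the pointer argument carries a side effect (p++). */
static int sk_cas_demo(void)
{
	int slots[2] = { 5, 9 };
	int *p = slots;
	int expected = 5;
	int ok = SK_SAFE_CAS(p++, &expected, 7);

	assert(p == slots + 1); /* p++ ran exactly once */
	return ok && slots[0] == 7 && slots[1] == 9;
}
```

With the naive form, the same call would compare against the first slot and then write into the second one, because `p++` is expanded once in the comparison and once in the assignment.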
Baptiste Assmann	0e9d87bf06	MINOR: istbuf: add ist2buf() function Purpose of this function is to build a <struct buffer> from a <struct ist>.	2020-05-05 15:28:59 +02:00
Baptiste Assmann	de80201460	MINOR: ist: add istissame() function The istissame() function takes 2 ist and compare their <.ptr> and <.len> values respectively. It returns non-zero if they are the same.	2020-05-05 15:28:59 +02:00
Baptiste Assmann	9ef1967af7	MINOR: ist: add istadv() function The purpose of istadv() function is to move forward <.ptr> by <nb> characters. It is very useful when parsing a payload.	2020-05-05 15:28:59 +02:00
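
The two ist helpers described above (istissame() and istadv()) can be sketched as follows. This is a simplified, hypothetical re-implementation: the `sk_` names and the struct layout are assumptions made for illustration, and clamping an over-long advance to the end of the string is a choice made for this sketch rather than something stated in the commit messages.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an indirect string: a pointer plus a length. */
struct sk_ist {
	const char *ptr;
	size_t len;
};

/* Build an sk_ist covering a NUL-terminated C string. */
static struct sk_ist sk_ist_from(const char *s)
{
	struct sk_ist r;

	assert(s != NULL);
	r.ptr = s;
	r.len = strlen(s);
	return r;
}

/* Non-zero if both values designate the very same area (same pointer and
 * same length); the contents are not compared. */
static int sk_istissame(struct sk_ist a, struct sk_ist b)
{
	return a.ptr == b.ptr && a.len == b.len;
}

/* Move the pointer forward by <nb> characters and shrink the length
 * accordingly. Assumption: an advance past the end is clamped. */
static struct sk_ist sk_istadv(struct sk_ist s, size_t nb)
{
	if (nb > s.len)
		nb = s.len;
	s.ptr += nb;
	s.len -= nb;
	return s;
}
```

When parsing a payload such as "Host: example", advancing by 6 skips the name, the colon and the space, leaving a 7-character view on the value.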
Christopher Faulet	3970819a55	MEDIUM: checks: Support matching on headers for http-check expect rules It is now possible to add http-check expect rules matching HTTP header names and values. Here is the format of these rules: http-check expect header name [ -m <meth> ] <name> [log-format] \ [ value [ -m <meth> ] <value> [log-format] [full] ] The name pattern (name ...) is mandatory but the value pattern (value ...) is optional. If not specified, only the header presence is verified. <meth> is the matching method, applied on the header name or the header value. Supported matching methods are: * "str" (exact match) * "beg" (prefix match) * "end" (suffix match) * "sub" (substring match) * "reg" (regex match) If not specified, exact matching method is used. If the "log-format" option is used, the pattern (<name> or <value>) is evaluated as a log-format string. This option cannot be used with the regex matching method. Finally, by default, the header value is considered as comma-separated list. Each part may be tested. The "full" option may be used to test the full header line. Note that matchings are case insensitive on the header names.	2020-05-05 11:19:27 +02:00
Christopher Faulet	8dd33e13a5	MINOR: http-htx: Support different methods to look for header names It is now possible to use different matching methods to look for header names in an HTTP message: * The exact match. It is the default method. http_find_header() uses this method. http_find_str_header() is an alias. * The prefix match. It evals the header names starting by a prefix. http_find_pfx_header() must be called to use this method. * The suffix match. It evals the header names ending by a suffix. http_find_sfx_header() must be called to use this method. * The substring match. It evals the header names containing a string. http_find_sub_header() must be called to use this method. * The regex match. It evals the header names matching a regular expression. http_match_header() must be called to use this method.	2020-05-05 11:07:00 +02:00
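
The exact, prefix, suffix and substring methods listed above can be sketched on plain C strings as follows. The `sk_` names are hypothetical; this is not the code behind http_find_header() and its variants, which operate on HTX messages, and the regex method is left out. Comparisons ignore case because HTTP header names are case insensitive.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum sk_match { SK_MATCH_STR, SK_MATCH_BEG, SK_MATCH_END, SK_MATCH_SUB };

/* Non-zero if the first <n> characters of <a> and <b> are equal,
 * ignoring ASCII case. */
static int sk_ncaseeq(const char *a, const char *b, size_t n)
{
	while (n--) {
		if (tolower((unsigned char)*a++) != tolower((unsigned char)*b++))
			return 0;
	}
	return 1;
}

/* Non-zero if header name <name> matches pattern <pat> with method <m>. */
static int sk_hdr_name_match(const char *name, const char *pat, enum sk_match m)
{
	size_t nlen, plen, i;

	assert(name != NULL && pat != NULL);
	nlen = strlen(name);
	plen = strlen(pat);
	if (plen > nlen)
		return 0;

	switch (m) {
	case SK_MATCH_STR: /* exact match */
		return nlen == plen && sk_ncaseeq(name, pat, plen);
	case SK_MATCH_BEG: /* prefix match */
		return sk_ncaseeq(name, pat, plen);
	case SK_MATCH_END: /* suffix match */
		return sk_ncaseeq(name + nlen - plen, pat, plen);
	case SK_MATCH_SUB: /* substring match */
		for (i = 0; i + plen <= nlen; i++) {
			if (sk_ncaseeq(name + i, pat, plen))
				return 1;
		}
		return 0;
	}
	return 0;
}
```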
Christopher Faulet	778f5ed478	MEDIUM: checks/http-fetch: Support htx prefetch from a check for HTTP samples Some HTTP sample fetches will be accessible from the context of a http-check health check. Thus, the prefetch function responsible to return the HTX message has been updated to handle a check, in addition to a channel. Both cannot be used at the same time. So there is no ambiguity.	2020-05-05 11:06:43 +02:00
Willy Tarreau	86c6a9221a	BUG/MEDIUM: shctx: bound the number of loops that can happen around the lock Given that a "count" value of 32M was seen in _shctx_wait4lock(), it is very important to prevent this from happening again. It's absolutely essential to prevent the value from growing unbounded because with an increase of the number of threads, the number of successive failed attempts will necessarily grow. Instead now we're scanning all 2^p-1 values from 3 to 255 and are bounding count to 255 so that in the worst case each thread tries an xchg every 255 failed read attempts. That's one every 4 on average per thread when there are 64 threads, which corresponds to the initial count of 4 for the first attempt so it seems like a reasonable value to keep a low latency. The bug was introduced with the shctx entries in 1.5 so the fix must be backported to all versions. Before 1.8 the function was called _shared_context_wait4lock() and was in shctx.c.	2020-05-01 13:32:20 +02:00
Willy Tarreau	3801bdc3fc	BUG/MEDIUM: shctx: really check the lock's value while waiting Jérôme reported an amazing crash in the spinlock version of _shctx_wait4lock() with an extremely high <count> value of 32M! The root cause is that the function cannot deal with contention on the lock at all because it forgets to check if the lock's value has changed! As such, every time it's called due to a contention, it waits twice as long before trying again and lets the caller check for the contention by itself. The correct thing to do is to compare the value again at each loop. This way it makes sure to mostly perform read accesses on the shared cache line without writing too often, and to be ready fast enough to try to grab the lock. And we must not increase the count on success either! Unfortunately I'd have expected to see a performance boost on the cache with this but there was absolutely no change, so it's very likely that these issues only happen once in a while and are sufficient to derail the process when they strike, but not to have a permanent performance impact. The bug was introduced with the shctx entries in 1.5 so the fix must be backported to all versions. Before 1.8 the function was called _shared_context_wait4lock() and was in shctx.c.	2020-05-01 13:29:14 +02:00
Willy Tarreau	f0e5da20e1	BUG/MINOR: debug: properly use long long instead of long for the thread ID I changed my mind twice on this one and pushed after the last test with threads disabled, without re-enabling long long, causing this rightful build warning. This needs to be backported if the previous commit `ff64d3b027` ("MINOR: threads: export the POSIX thread ID in panic dumps") is backported as well.	2020-05-01 12:26:03 +02:00
Willy Tarreau	ff64d3b027	MINOR: threads: export the POSIX thread ID in panic dumps It is very difficult to map a panic dump against a gdb thread dump because the thread numbers do not match. However gdb provides the pthread ID but this one is supposed to be opaque and not to be cast to a scalar. This patch provides a function, ha_get_pthread_id() which retrieves the pthread ID of the indicated thread and casts it to an unsigned long long so as to lose the least possible amount of information from it. This is done cleanly using a union to maintain alignment so as long as these IDs are stored on 1..8 bytes they will be properly reported. This ID is now presented in the panic dumps so it now becomes possible to map these threads. When threads are disabled, zero is returned. For example, this is a panic dump: Thread 1 is about to kill the process. >Thread 1 : id=0x7fe92b825180 act=0 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0 stuck=1 prof=0 harmless=0 wantrdv=0 cpu_ns: poll=5119122 now=2009446995 diff=2004327873 curr_task=0xc99bf0 (task) calls=4 last=0 fct=0x592440(task_run_applet) ctx=0xca9c50(<CLI>) strm=0xc996a0 src=unix fe=GLOBAL be=GLOBAL dst=<CLI> rqf=848202 rqa=0 rpf=80048202 rpa=0 sif=EST,200008 sib=EST,204018 af=(nil),0 csf=0xc9ba40,8200 ab=0xca9c50,4 csb=(nil),0 cof=0xbf0e50,1300:PASS(0xc9cee0)/RAW((nil))/unix_stream(20) cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0) call trace(20): | 0x59e4cf [48 83 c4 10 5b 5d 41 5c]: wdt_handler+0xff/0x10c | 0x7fe92c170690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690 | 0x7ffce29519d9 [48 c1 e2 20 48 09 d0 48]: linux-vdso:+0x9d9 | 0x7ffce2951d54 [eb d9 f3 90 e9 1c ff ff]: linux-vdso:__vdso_gettimeofday+0x104/0x133 | 0x57b484 [48 89 e6 48 8d 7c 24 10]: main+0x157114 | 0x50ee6a [85 c0 75 76 48 8b 55 38]: main+0xeaafa | 0x50f69c [48 63 54 24 20 85 c0 0f]: main+0xeb32c | 0x59252c [48 c7 c6 d8 ff ff ff 44]: task_run_applet+0xec/0x88c Thread 2 : id=0x7fe92b6e6700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0 stuck=0 prof=0 harmless=1 
wantrdv=0 cpu_ns: poll=786738 now=1086955 diff=300217 curr_task=0 Thread 3 : id=0x7fe92aee5700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0 stuck=0 prof=0 harmless=1 wantrdv=0 cpu_ns: poll=828056 now=1129738 diff=301682 curr_task=0 Thread 4 : id=0x7fe92a6e4700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0 stuck=0 prof=0 harmless=1 wantrdv=0 cpu_ns: poll=818900 now=1153551 diff=334651 curr_task=0 And this is the gdb output: (gdb) info thr Id Target Id Frame 1 Thread 0x7fe92b825180 (LWP 15234) 0x00007fe92ba81d6b in raise () from /lib64/libc.so.6 2 Thread 0x7fe92b6e6700 (LWP 15235) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6 3 Thread 0x7fe92a6e4700 (LWP 15237) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6 4 Thread 0x7fe92aee5700 (LWP 15236) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6 We can clearly see that while threads 1 and 2 are the same, gdb's threads 3 and 4 respectively are haproxy's threads 4 and 3. This may be backported to 2.0 as it removes some confusion in github issues.	2020-05-01 11:45:56 +02:00
Christopher Faulet	dc75d577b9	CLEANUP: checks: Fix checks includes	2020-04-29 13:32:29 +02:00
Christopher Faulet	1543d44607	MINOR: http-htx: Export functions to update message authority and host These functions will be used by HTTP health checks when a request is formatted before sending it.	2020-04-29 13:32:29 +02:00
Damien Claisse	57c8eb939d	MINOR: log: Add "Tu" timer It can sometimes be useful to measure the total time of a request as seen from an end user, including TCP/TLS negotiation, server response time and transfer time. "Tt" currently provides something close to that, but it also takes client idle time into account, which is problematic for keep-alive requests as idle time can be very long. "Ta" is also not sufficient as it hides TCP/TLS negotiation time. To improve that, introduce a "Tu" timer, which excludes idle time but includes everything else. It roughly estimates the time spent from the user's point of view (without DNS resolution time), assuming network latency is the same in both directions.	2020-04-28 16:30:13 +02:00
Christopher Faulet	bfb0f72d52	BUG/MEDIUM: sessions: Always pass the mux context as argument to destroy a mux This bug was introduced by the commit `2444aa5b` ("MEDIUM: sessions: Don't be responsible for connections anymore."). In session_check_idle_conn(), when the mux is destroyed, its context must be passed as argument instead of the connection. It is a 2.2-dev bug. No need to backport.	2020-04-27 15:53:43 +02:00
Christopher Faulet	4a8c026117	BUG/MINOR: checks/server: use_ssl member must be signed	2020-04-27 12:13:06 +02:00
Christopher Faulet	8021a5f4a5	MINOR: checks: Support list of status codes on http-check expect rules It is now possible to match on a comma-separated list of status codes or ranges of codes. In addition, instead of a string comparison to match the response's status code, an integer comparison is performed. Here is an example: http-check expect status 200,201,300-310	2020-04-27 10:46:28 +02:00
Christopher Faulet	88d939c831	Revert "MEDIUM: checks: capture groups in expect regexes" This reverts commit 1979943c30ef285ed04f07ecf829514de971d9b2. Captures in comments were only used when a tcp-check expect based on a negative regex matching failed, to eventually report what was captured while it was not expected. It is a bit too far-fetched to be usable IMHO. on-error and on-success log-format strings are far more usable. For now there are few check sample fetches (in fact only one...). But it could be really powerful to report info in logs.	2020-04-27 10:46:28 +02:00
Christopher Faulet	d7cee71e77	MINOR: checks: Use a tree instead of a list to store tcp-check rulesets Since all tcp-check rulesets are globally stored, it is a problem to use a list. For configurations with many backends, the lookups in the list may be costly and slow down HAProxy startup. To solve this problem, tcp-check rulesets are now stored in a tree.	2020-04-27 10:46:28 +02:00
Christopher Faulet	0417975bdc	MINOR: ist: Add a function to retrieve the ist pointer There is already the istlen() function to get the ist length. Now, it is possible to call istptr() to get the ist pointer.	2020-04-27 10:46:28 +02:00
Christopher Faulet	61cc852230	CLEANUP: checks: Reorg checks.c file to be more readable The patch is not obvious at the first glance. But it is just a reorg. Functions have been grouped and ordered in a more logical way. Some structures and flags are now private to the checks module (so moved from the .h to the .c file).	2020-04-27 10:46:28 +02:00
Christopher Faulet	d7e639661a	MEDIUM: checks: Implement default TCP check using tcp-check rules Default health-checks, without any option, doing only a connection check, are now based on tcp-checks. An implicit default tcp-check connect rule is used. A shared tcp-check ruleset, named "*tcp-check", is created to support these checks.	2020-04-27 10:46:28 +02:00
Christopher Faulet	a9e1c4c7c2	MINOR: connection: Add a function to install a mux for a health-check This function is unused for now. But it will be used to install a mux for an outgoing connection opened in a health-check context. In this case, the session's origin is the check itself, and it is used to know the mode, HTTP or TCP, depending on the tcp-check type and not the proxy mode. The check is also used to get the mux protocol if configured.	2020-04-27 09:39:38 +02:00
Christopher Faulet	b356714769	MINOR: checks: Add a mux proto to health-check and tcp-check connect rule It is not set and not used for now, but it will be possible to force the mux protocol thanks to this patch. A mux proto field is added to the checks and to tcp-check connect rules.	2020-04-27 09:39:38 +02:00
Christopher Faulet	a142c1deb4	BUG/MINOR: obj_type: Handle stream object in obj_base_ptr() function The stream object (OBJ_TYPE_STREAM) was missing in the switch statement of the obj_base_ptr() function. This patch must be backported as far as 2.0.	2020-04-27 09:39:38 +02:00
Christopher Faulet	3829046893	MINOR: checks/obj_type: Add a new object type for checks An object type is now assigned to the check structure.	2020-04-27 09:39:38 +02:00
Christopher Faulet	e60abd1a06	MINOR: connection: Add macros to know if a conn or a cs uses an HTX mux IS_HTX_CONN() and IS_HTX_CS() may now be used to know if a connection or a conn-stream uses an HTX-based multiplexer.	2020-04-27 09:39:38 +02:00
Christopher Faulet	e5870d872b	MAJOR: checks: Implement HTTP check using tcp-check rules HTTP health-checks are now internally based on tcp-checks. Of course all the configuration parsing of the "http-check" keyword and the httpchk option has been rewritten. But the main change is that now, as for tcp-check rulesets, it is possible to perform several send/expect sequences in the same health-check. Thus the connect rule is now also available from HTTP checks, just like set-var, unset-var and comment rules. Because the request defined by the "option httpchk" line is used for the first request only, it is now possible to set the method, the uri and the version on a "http-check send" line.	2020-04-27 09:39:38 +02:00
Christopher Faulet	5eb96cbcbc	MINOR: standard: Add my_memspn and my_memcspn Do the same as strspn() and strcspn() but on a raw bytes buffer.	2020-04-27 09:39:38 +02:00
Christopher Faulet	12d5740a38	MINOR: checks: Introduce flags to configure in tcp-check expect rules Instead of having 2 independent integers, used as boolean values, to know if the expect rule is inverted and to know if the matching regexp has captures, we now use a 32-bit bitfield.	2020-04-27 09:39:38 +02:00
Christopher Faulet	f930e4c4df	MINOR: checks: Use an indirect string to represent the expect matching string Instead of having a string in the expect union with its length outside of the union, directly in the expect structure, an indirect string is now used.	2020-04-27 09:39:38 +02:00
Christopher Faulet	404f919995	MEDIUM: checks: Use a shared ruleset to store tcp-check rules All tcp-check rules are now stored in the global shared list. The ones created to parse a specific protocol, for instance redis, are already stored in this list. Now pure tcp-check rules are also stored in it. The ruleset name is created using the proxy name and its config file and line. tcp-check rules declared in a defaults section are also stored this way using "defaults" as proxy name. For now, all tcp-check rulesets are stored in a list. But it could be a bit slow to look for a specific ruleset with a huge number of backends. So, it could be a good idea to use a tree instead.	2020-04-27 09:39:38 +02:00
Christopher Faulet	6f5579160a	MINOR: proxy/checks: Move parsing of external-check option in checks.c Parsing of the proxy directive "option external-check" has been moved to checks.c.	2020-04-27 09:39:38 +02:00
Christopher Faulet	430e480510	MINOR: proxy/checks: Move parsing of tcp-check option in checks.c Parsing of the proxy directive "option tcp-check" has been moved to checks.c.	2020-04-27 09:39:38 +02:00
Christopher Faulet	6c2a743538	MINOR: proxy/checks: Move parsing of httpchk option in checks.c Parsing of the proxy directive "option httpchk" has been moved to checks.c.	2020-04-27 09:39:38 +02:00
Christopher Faulet	ec07e386a7	MINOR: checks: Add an option to set success status of tcp-check expect rules It is now possible to specify the healthcheck status to use on success of a tcp-check rule, if it is the last evaluated rule. The option "ok-status" supports "L4OK", "L6OK", "L7OK" and "L7OKC" status.	2020-04-27 09:39:38 +02:00
Christopher Faulet	799f3a4621	MINOR: Produce tcp-check info message for pure tcp-check rules only This way, messages reported by protocol checks are closer to the old ones.	2020-04-27 09:39:38 +02:00
Christopher Faulet	0ae3d1dbdf	MEDIUM: checks: Implement agent check using tcp-check rules A shared tcp-check ruleset is now created to support agent checks. The following sequence is used : tcp-check send "%[var(check.agent_string)]" log-format tcp-check expect custom The custom function to evaluate the expect rule does the same as what was done to handle the agent response when a custom check was used.	2020-04-27 09:39:38 +02:00
Christopher Faulet	267b01b761	MEDIUM: checks: Implement SPOP check using tcp-check rules A shared tcp-check ruleset is now created to support SPOP checks. This way no extra memory is used if several backends use a SPOP check. The following sequence is used : tcp-check send-binary SPOP_REQ tcp-check expect custom min-recv 4 The spop request is the result of the function spoe_prepare_healthcheck_request() and the expect rule relies on a custom function calling spoe_handle_healthcheck_response().	2020-04-27 09:39:38 +02:00
Christopher Faulet	1997ecaa0c	MEDIUM: checks: Implement LDAP check using tcp-check rules A shared tcp-check ruleset is now created to support LDAP check. This way no extra memory is used if several backends use a LDAP check. The following sequence is used : tcp-check send-binary "300C020101600702010304008000" tcp-check expect rbinary "^30" min-recv 14 \ on-error "Not LDAPv3 protocol" tcp-check expect custom The last expect rule relies on a custom function to check the LDAP server reply.	2020-04-27 09:39:38 +02:00
Christopher Faulet	f2b3be5c27	MEDIUM: checks: Implement MySQL check using tcp-check rules A shared tcp-check ruleset is now created to support MySQL checks. This way no extra memory is used if several backends use a MySQL check. One of the following sequences is used : ## If no extra params are set tcp-check connect default linger tcp-check expect custom ## will test the initial handshake ## If the username is defined tcp-check connect default linger tcp-check send-binary MYSQL_REQ log-format tcp-check expect custom ## will test the initial handshake tcp-check expect custom ## will test the reply to the client message The log-format hexa string MYSQL_REQ depends on 2 preset variables, the packet header containing the packet length and the sequence ID (check.header) and the username (check.username). It is also different if the "post-41" option is set or not. Expect rules rely on custom functions to check MySQL server packets.	2020-04-27 09:39:38 +02:00
Christopher Faulet	ce355074f1	MEDIUM: checks: Implement postgres check using tcp-check rules A shared tcp-check ruleset is now created to support postgres check. This way no extra memory is used if several backends use a pgsql check. The following sequence is used : tcp-check connect default linger tcp-check send-binary PGSQL_REQ log-format tcp-check expect !rstring "^E" min-recv 5 \ error-status "L7RSP" on-error "%[check.payload(6,0)]" tcp-check expect rbinary "^520000000800000000" min-recv "9" \ error-status "L7STS" \ on-success "PostgreSQL server is ok" \ on-error "PostgreSQL unknown error" The log-format hexa string PGSQL_REQ depends on 2 preset variables, the packet length (check.plen) and the username (check.username).	2020-04-27 09:39:38 +02:00
Christopher Faulet	fbcc77c6ba	MEDIUM: checks: Implement smtp check using tcp-check rules A shared tcp-check ruleset is now created to support smtp checks. This way no extra memory is used if several backends use a smtp check. The following sequence is used : tcp-check connect default linger tcp-check expect rstring "^[0-9]{3}[ \r]" min-recv 4 \ error-status "L7RSP" on-error "%[check.payload(),cut_crlf]" tcp-check expect rstring "^2[0-9]{2}[ \r]" min-recv 4 \ error-status "L7STS" \ on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \ status-code "check.payload(0,3)" tcp-check send "%[var(check.smtp_cmd)]\r\n" log-format tcp-check expect rstring "^2[0-9]{2}[- \r]" min-recv 4 \ error-status "L7STS" \ on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \ on-success "%[check.payload(4,0),ltrim(' '),cut_crlf]" \ status-code "check.payload(0,3)" The variable check.smtp_cmd is by default the string "HELO localhost" but may be customized by setting the <helo> and <domain> parameters on the option smtpchk line. Note there is a difference with the old smtp check. The server greeting message is checked before sending the HELO/EHLO command.	2020-04-27 09:39:38 +02:00
Christopher Faulet	811f78ced1	MEDIUM: checks: Implement ssl-hello check using tcp-check rules A shared tcp-check ruleset is now created to support ssl-hello check. This way no extra memory is used if several backends use a ssl-hello check. The following sequence is used : tcp-check send-binary SSLV3_CLIENT_HELLO log-format tcp-check expect rbinary "^1[56]" min-recv 5 \ error-status "L6RSP" tout-status "L6TOUT" SSLV3_CLIENT_HELLO is a log-format hexa string representing a SSLv3 CLIENT HELLO packet. It is the same as the one used by the old ssl-hello except the sample expression "%[date(),htonl,hex]" is used to set the date field.	2020-04-27 09:39:38 +02:00
Christopher Faulet	33f05df650	MEDIUM: checks: Implement redis check using tcp-check rules A shared tcp-check ruleset is now created to support redis checks. This way no extra memory is used if several backends use a redis check. The following sequence is used : tcp-check send "*1\r\n$4\r\nPING\r\n" tcp-check expect string "+PONG\r\n" error-status "L7STS" \ on-error "%[check.payload(),cut_crlf]" on-success "Redis server is ok"	2020-04-27 09:39:38 +02:00
Christopher Faulet	9e6ed1598e	MINOR: checks: Support custom functions to eval a tcp-check expect rules It is now possible to set a custom function to evaluate a tcp-check expect rule. It is an internal and undocumented option because the right function pointer must be set and it is not possible to express it in the configuration. It will be used to convert some protocol healthchecks to tcp-checks. Custom functions must have the following signature: enum tcpcheck_eval_ret (*custom)(struct check *, struct tcpcheck_rule *, int);	2020-04-27 09:39:38 +02:00
Christopher Faulet	6f87adcf20	MINOR: checks: Export the tcpcheck_eval_ret enum This enum will be used to define custom function for tcp-check expect rules.	2020-04-27 09:39:38 +02:00
Christopher Faulet	7a1e2e1823	MEDIUM: checks: Add a list of vars to set before executing a tpc-check ruleset A list of variables is now associated to each tcp-check ruleset. It is more or less a list of set-var expressions. This list may be filled during the configuration parsing. The listed variables will then be set during each execution of the tcp-check healthcheck, at the beginning, before execution of the first tcp-check rule. This patch is mandatory to convert all protocol checks to tcp-checks. It is a way to customize shared tcp-check rulesets.	2020-04-27 09:39:37 +02:00
Christopher Faulet	bb591a1a11	MINOR: checks: Relax the default option for tcp-check connect rules Now this option may be mixed with other options. This way, options on the server line are used but may be overridden by tcp-check connect options.	2020-04-27 09:39:37 +02:00
Christopher Faulet	98cc57cf5c	MEDIUM: checks: Add status-code sample expression on tcp-check expect rules This option defines a sample expression, evaluated as an integer, to set the status code (check->code) if a tcp-check healthcheck ends on the corresponding expect rule.	2020-04-27 09:39:37 +02:00
Christopher Faulet	be52b4de66	MEDIUM: checks: Add on-error/on-success option on tcp-check expect rules These options define log-format strings used to produce the info message if a tcp-check expect rule fails (on-error option) or succeeds (on-success option). For this last option, it must be the ending rule, otherwise the parameter is ignored.	2020-04-27 09:39:37 +02:00
Christopher Faulet	cf80f2f263	MINOR: checks: Add option to tcp-check expect rules to customize error status It is now possible to specify the healthcheck status to use on error or on timeout for tcp-check expect rules. First, to define the error status, the option "error-status" must be used followed by "L4CON", "L6RSP", "L7RSP" or "L7STS". Then, to define the timeout status, the option "tout-status" must be used followed by "L4TOUT", "L6TOUT" or "L7TOUT". These options will be used to convert specific protocol healthchecks (redis, pgsql...) to tcp-check ones.	2020-04-27 09:39:37 +02:00
Christopher Faulet	1032059bd0	MINOR: checks: Use a name for the healthcheck status enum The enum defining all healthcheck status (HCHK_STATUS_*) is now named.	2020-04-27 09:39:37 +02:00
Christopher Faulet	5d503fcf5b	MEDIUM: checks: Add a shared list of tcp-check rules A global list of tcp-check rulesets can now be used to share common rulesets with all backends without any duplication. It is mandatory to convert all specific protocol checks (redis, pgsql...) to tcp-check healthchecks. To do so, a flag is now attached to each tcp-check ruleset to know if it is a shared ruleset or not. tcp-check rules defined in a backend are still directly attached to the proxy and not shared. In addition a second flag is used to know if the ruleset is inherited from the defaults section.	2020-04-27 09:39:37 +02:00
Christopher Faulet	f50f4e956f	MEDIUM: checks: Support log-format strings for tcp-check send rules An extra parameter for tcp-check send rules can be specified to handle the string or the hexa string as a log-format one. Using "log-format" option, instead of considering the data to send as raw data, it is parsed as a log-format string. Thus it is possible to call sample fetches to customize data sent to a server. Of course, because we have no stream attached to healthchecks, not all sample fetches are available. So be careful. tcp-check set-var(check.port) int(8000) tcp-check set-var(check.uri) str(/status) tcp-check connect port var(check.port) tcp-check send "GET %[check.uri] HTTP/1.0\r\n" log-format tcp-check send "Host: %[srv_name]\r\n" log-format tcp-check send "\r\n"	2020-04-27 09:39:37 +02:00
Christopher Faulet	b7d30098f3	MEDIUM: checks: Support expression to set the port Since we have a session attached to tcp-check healthchecks, it is possible to use sample expressions and variables. In addition, it is possible to add tcp-check set-var rules to define custom variables. So, now, a sample expression can be used to define the port to use to establish a connection for a tcp-check connect rule. For instance: tcp-check set-var(check.port) int(8888) tcp-check connect port var(check.port)	2020-04-27 09:39:37 +02:00
Christopher Faulet	5c28874a69	MINOR: checks: Add the addr option for tcp-check connect rule With this option, it is now possible to use a specific address to open the connection for a tcp-check connect rule. If the port option is also specified, it is used in priority.	2020-04-27 09:39:37 +02:00
Christopher Faulet	d75f57e94c	MINOR: ssl: Export a generic function to parse an alpn string Parsing of an alpn string has been moved in a dedicated function and exposed to be used from outside the ssl_sock module.	2020-04-27 09:39:37 +02:00
Christopher Faulet	085426aea9	MINOR: checks: Add the via-socks4 option for tcp-check connect rules With this option, it is possible to establish the connection opened by a tcp-check connect rule using an upstream socks4 proxy. Info from the socks4 parameter on the server is used.	2020-04-27 09:39:37 +02:00
Christopher Faulet	79b31d4ee5	MINOR: checks: Add the sni option for tcp-check connect rules With this option, it is possible to specify the SNI to be used for an SSL connection opened by a tcp-check connect rule.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	707b52f17e	MEDIUM: checks: Parse custom action rules in tcp-checks Register the custom action rules "set-var" and "unset-var", that will call the parse_store() command upon parsing. These rules are thus built and integrated to the tcp-check ruleset, but have no further effect for the moment.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	13a5043a9e	MINOR: checks/vars: Add a check scope for variables Add a dedicated vars scope for checks. This scope is considered as part of the session scope for accounting purposes. The scope can be addressed by a valid session, even embryonic. The stream is not necessary. The scope is initialized after the check session is created. All variables are then pruned before the session is destroyed.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	05d692dc09	MEDIUM: checks: Associate a session to each tcp-check healthcheck Create a session for each healthcheck relying on a tcp-check ruleset. When such check is started, a session is allocated, which will be freed when the check finishes. A dummy static frontend is used to create these sessions. This will be useful to support variables and sample expression. This will also be used, later, by HTTP healthchecks to rely on HTTP muxes.	2020-04-27 09:39:37 +02:00
Christopher Faulet	b2c2e0fcca	MAJOR: checks: Refactor and simplify the tcp-check loop The loop in tcpcheck_main() function is quite hard to understand. Depending on where we are in the loop, the current_step is the currently executed rule or the one to execute on the next call to tcpcheck_main(). When the check result is reported, we rely on the rule pointed by last_started_step or the one pointed by current_step. In addition, the loop does not use the common list_for_each_entry macro and it is thus quite confusing. So the loop has been totally rewritten and split into several functions to simplify its reading and its understanding. Tcp-check rules are evaluated in dedicated functions. And a common for_each loop is used and only one rule is referenced, the current one.	2020-04-27 09:39:37 +02:00
Christopher Faulet	a202d1d4c1	MEDIUM: checks: Add implicit tcp-check connect rule After the configuration parsing, during its validity check, an implicit tcp-check connect rule is added in front of the tcp-check ruleset if the first non-comment rule is not a connect one. This implicit rule is flagged to use the default check parameter. This means now, all tcp-check rulesets begin with a connect and are never empty. When tcp-check healthchecks are used, all connections are thus handled by tcpcheck_main() function.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	06d963aeca	MINOR: checks: define a tcp-check connect type The check rule itself is not changed, only its representation.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	48219dc50e	MINOR: checks: define tcp-check send type The check rule itself is not changed, only its representation.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	5301b01f99	MINOR: checks: Set the tcp-check rule index during parsing Now the position of a tcp-check rule in a chain is set during the parsing. This significantly simplifies the function retrieving the current step id.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	04578dbf37	MINOR: checks: Don't use a static tcp rule list head To allow reusing these blocks without consuming more memory, their list should be static and share-able across uses. The head of the list will be shared as well. It is thus necessary to extract the head of the rule list from the proxy itself. Transform it into a pointer instead, that can be easily set to an external dynamically allocated head.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	9dcb09fc98	MEDIUM: checks: capture groups in expect regexes Parse back-references in comments of tcp-check expect rules. If references are made, capture groups in the match and replace references to it within the comment when logging the error. Both text and binary regex can capture groups and reference them in the expect rule comment. [Cf: I slightly updated the patch. exp_replace() function is used instead of a custom one. And if the trash buffer is too small to contain the comment during the substitution, the comment is ignored.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	efab6c61d9	MINOR: checks: add rbinary expect match type The rbinary match works similarly to the rstring match type, however the received data is rewritten as hex-string before the match operation is done. This allows using regexes on binary content even with the POSIX regex engine. [Cf: I slightly updated the patch. mem2hex function was removed and dump_binary is used instead.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	b616add793	MINOR: checks: define a tcp expect type Extract the expect definition from its tcpcheck ; create a standalone type.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	f8ba6773e5	MINOR: checks: add linger option to tcp connect Allow declaring tcpcheck connect commands with a new parameter, "linger". This option will configure the connection to avoid using an RST segment to close, instead following the four-way termination handshake. Some servers would otherwise log each healthcheck as an error.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	1afd826ae4	MINOR: checks: add min-recv tcp-check expect option Some expect rules cannot be satisfied due to inherent ambiguity towards the received data: in the absence of match, the current behavior is to be forced to wait either the end of the connection or a buffer full, whichever comes first. Only then is the matching diagnostic considered conclusive. For instance : tcp-check connect tcp-check expect !rstring "^error" tcp-check expect string "valid" This check will only succeed if the connection is closed by the server before the check timeout. Otherwise the first expect rule will wait for more data until "^error" regex matches or the check expires. Allow the user to explicitly define an amount of data that will be considered enough to determine the value of the check. This allows succeeding on negative rstring rules, as previously in valid condition no match happened, and the matching was repeated until the end of the connection. This could timeout the check while no error was happening. [Cf: I slightly updated the patch. The parameter was renamed and the value is a signed integer to support -1 as default value to ignore the parameter.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	4038b94706	MEDIUM: checks: rewind to the first inverse expect rule of a chain on new data When receiving additional data while chaining multiple tcp-check expects, previous inverse expects might have a different result with the new data. They need to be evaluated again against the new data. Add a pointer to the first inverse expect rule of the current expect chain (possibly of length one) to each expect rule. When receiving new data, the currently evaluated tcp-check rule is set back to this pointed rule. Functionally speaking, it is a bug and it exists since the introduction of the feature. But there is no way for now to hit it because when an expect rule does not match, we wait for more data, independently of the inverse flag. The only way to move to the following rule is to be sure no more data will be received. This patch depends on the commit "MINOR: mini-clist: Add functions to iterate backward on a list". [Cf: I slightly updated the patch. First, it only concerns inverse expect rule. Normal expect rules are not concerned. Then, I removed the BUG tag because, for now, it is not possible to move to the following rule when the current one does not match while more data can be received.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	dd66732ffe	MINOR: checks: Use an enum to describe the tcp-check rule type Replace the generic integer with an enumerated list. This allows light type check and helps debugging (seeing action = 2 in the struct is not helpful).	2020-04-27 09:39:37 +02:00
Christopher Faulet	31c30fdf1e	CLEANUP: checks: Don't export anymore init_check and srv_check_healthcheck_port These functions are no longer called outside the checks.	2020-04-27 09:39:37 +02:00
Christopher Faulet	f61f33a1b2	BUG/MINOR: checks: Respect the no-check-ssl option This option is used to force a non-SSL connection to check a SSL server or to invert a check-ssl option inherited from the default section. The use_ssl field in the check structure is used to know if a SSL connection must be used (use_ssl=1) or not (use_ssl=0). The server configuration is used by default. The problem is that we cannot distinguish the default case (no specific SSL check option) and the case of an explicit non-SSL check. In both, use_ssl is set to 0. So the server configuration is always used. For a SSL server, when no-check-ssl option is set, the check is still performed using a SSL configuration. To fix the bug, instead of a boolean value (0=TCP, 1=SSL), we use a ternary value : * 0 = use server config * 1 = force SSL * -1 = force non-SSL The same is done for the server parameter. It is not really necessary for now. But it is a good way to know if the server no-ssl option is set. In addition, the PR_O_TCPCHK_SSL proxy option is no longer used to set use_ssl to 1 for a check. Instead the flag is directly tested to prepare or destroy the server SSL context. This patch should be backported as far as 1.8.	2020-04-27 09:39:37 +02:00
Christopher Faulet	8acb1284bc	MINOR: checks: Add a way to send custom headers and payload during http chekcs The 'http-check send' directive has been added to add headers and optionally a payload to the request sent during HTTP healthchecks. The request line may be customized by the "option httpchk" directive but there was no official way to add extra headers. An old trick consisted of hiding these headers at the end of the version string, on the "option httpchk" line. And it was impossible to add an extra payload with an "http-check expect" directive because of the "Connection: close" header appended to the request (See issue #16 for details). So to make things official and fully support payload additions, the "http-check send" directive has been added : option httpchk POST /status HTTP/1.1 http-check send hdr Content-Type "application/json;charset=UTF-8" \ hdr X-test-1 value1 hdr X-test-2 value2 \ body "{id: 1, field: \"value\"}" When a payload is defined, the Content-Length header is automatically added. So chunk-encoded requests are not supported yet. For now, there are no special validity checks on the extra headers. This patch is inspired by Kiran Gavali's work. It should fix the issue #16 and as far as possible, it may be backported, at least as far as 1.8.	2020-04-27 09:39:37 +02:00
Christopher Faulet	bc1f54b0fc	MINOR: mini-clist: Add functions to iterate backward on a list list_for_each_entry_rev() and list_for_each_entry_from_rev() and corresponding safe versions have been added to iterate on a list in the reverse order. All these functions work the same way as the forward versions, except they use the .p field to move from one element to another.	2020-04-27 09:39:37 +02:00
Christopher Faulet	aaae9a0e99	BUG/MINOR: check: Update server address and port to execute an external check Server address and port may change at runtime. So the address and port passed as arguments and as environment variables when an external check is executed must be updated. The current number of connections on the server was already updated before executing the command. So the same mechanism is used for the server address and port. But in addition, command arguments are also updated. This patch must be backported to all stable versions. It should fix the issue #577.	2020-04-27 09:39:13 +02:00
Willy Tarreau	62ba9ba6ca	BUG/MINOR: http: make url_decode() optionally convert '+' to SP The url_decode() function used by the url_dec converter and a few other call points is ambiguous on its processing of the '+' character which itself isn't stable in the spec. This one belongs to the reserved characters for the query string but not for the path nor the scheme, in which it must be left as-is. It's only in argument strings that follow the application/x-www-form-urlencoded encoding that it must be turned into a space, that is, in query strings and POST arguments. The problem is that the function is used to process full URLs and paths in various configs, and to process query strings from the stats page for example. This patch updates the function to differentiate the situation where it's parsing a path and a query string. A new argument indicates if a query string should be assumed, otherwise it's only assumed after seeing a question mark. The various locations in the code making use of this function were updated to take care of this (most call places were using it to decode POST arguments). The url_dec converter is usually called on path or url samples, so it needs to remain compatible with this and will default to parsing a path and turning the '+' to a space only after a question mark. However in situations where it would explicitly be extracted from a POST or a query string, it now becomes possible to enforce the decoding by passing a non-null value in argument. It seems to be what was reported in issue #585. This fix may be backported to older stable releases.	2020-04-23 20:03:27 +02:00
Willy Tarreau	09568fd54d	BUG/MINOR: tools: fix the i386 version of the div64_32 function As reported in issue #596, the edx register isn't marked as clobbered in div64_32(), which could technically allow gcc to try to reuse it if it needed a copy of the 32 highest bits of the o1 register after the operation. Two attempts were tried, one using a dummy 32-bit local variable to store the intermediary edx and another one switching to "=A" and making result a long long. It turns out the former makes the resulting object code significantly dirtier while the latter makes it better and was kept. This is due to gcc's difficulties at working with register pairs mixing 32- and 64- bit values on i386. It was verified that no code change happened at all on x86_64, armv7, aarch64 nor mips32. In practice it's only used by the frequency counters so this bug cannot even be triggered but better fix it. This may be backported to stable branches though it will not fix any issue.	2020-04-23 17:21:37 +02:00
Ilya Shipitsin	856aabcda5	CLEANUP: assorted typo fixes in the code and comments This is 8th iteration of typo fixes	2020-04-17 09:37:36 +02:00
Willy Tarreau	bb86986253	MINOR: init: report the haproxy version and executable path once on errors If haproxy fails to start and emits an alert, then it can be useful to have it also emit the version and the path used to load it. Some users may be mistakenly launching the wrong binary due to a misconfigured PATH variable and this will save them some troubleshooting time when it reports that some keywords are not understood. What we do here is that we try to extract the binary name from the AUX vector on glibc, and we report this as a NOTICE tag before the very first alert is emitted.	2020-04-16 10:52:41 +02:00
Ilya Shipitsin	d425950c68	CLEANUP: assorted typo fixes in the code and comments This is 7th iteration of typo fixes	2020-04-16 10:04:36 +02:00
Willy Tarreau	3eb10b8e98	MINOR: init: add -dW and "zero-warning" to reject configs with warnings Since some systems switched to service managers which hide all warnings by default, some users are not aware of some possibly important warnings and get caught too late with errors that could have been detected earlier. This patch adds a new global keyword, "zero-warning" and an equivalent command-line option "-dW" to refuse to start in case any warning is detected. It is recommended to use these with configurations that are managed by humans in order to catch mistakes very early.	2020-04-15 16:42:39 +02:00
Willy Tarreau	bebd212064	MINOR: init: report in "haproxy -c" whether there were warnings or not This helps quickly checking if the config produces any warning. For this we reuse the "warned" bit field to add a new WARN_ANY bit that is set by ha_warning(). The rest of the bit field was also cleaned from unused bits.	2020-04-15 16:42:00 +02:00
Frédéric Lécaille	8ba10fea69	BUG/MINOR: peers: Incomplete peers sections should be validated. Before supporting "server" line in "peers" section, such sections without any local peer were removed from the configuration to get it validated. This patch fixes the issue where a "server" line without address and port which is a remote peer without address and port makes the configuration parsing fail. When encountering such cases we now ignore such lines and remove them from the configuration. Thank you to Jérôme Magnin for having reported this bug. Must be backported to 2.1 and 2.0.	2020-04-15 10:47:39 +02:00
William Lallemand	b7296c42bd	CLEANUP: ssl: remove a commentary in struct ckch_inst The struct ckch_inst now handles the ssl_bind_conf so this commentary is obsolete	2020-04-09 16:13:42 +02:00
William Lallemand	caa161982f	CLEANUP: ssl/cli: use the list of filters in the crtlist_entry In 'commit ssl cert', instead of trying to regenerate a list of filters from the SNIs, use the list provided by the crtlist_entry used to generate the ckch_inst. This list of filters doesn't need to be free'd anymore since they are always reused from the crtlist_entry.	2020-04-08 16:52:51 +02:00
William Lallemand	02e19a5c7b	CLEANUP: ssl: use the refcount for the SSL_CTX' Use the refcount of the SSL_CTX' to free them instead of freeing them under certain conditions. That way we can free the SSL_CTX everywhere its pointer is used.	2020-04-08 16:52:51 +02:00
William Lallemand	c69f02d0f0	MINOR: ssl/cli: replace dump/show ssl crt-list by '-n' option The dump and show ssl crt-list commands do the same thing, they dump the content of a crt-list, but the 'show' displays an ID in the first column. Delete the 'dump' command so it is replaced by the 'show' one. The old 'show' command is replaced by an '-n' option to dump the ID. And the ID which was a pointer is replaced by a line number and placed after colons in the filename. Example: $ echo "show ssl crt-list -n kikyo.crt-list" | socat /tmp/sock1 - # kikyo.crt-list kikyo.pem.rsa:1 secure.domain.tld kikyo.pem.ecdsa:2 secure.domain.tld	2020-04-06 19:33:33 +02:00
Frédéric Lécaille	876ed55d9b	BUG/MINOR: protocol_buffer: Wrong maximum shifting. This patch fixes a bad stop condition when decoding a protocol buffer variable integer whose maximum length is 10 bytes, shifting a uint64_t value by more than 63. Thank you to Ilya for having reported this issue. Must be backported to 2.1 and 2.0.	2020-04-02 15:09:46 +02:00
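Note on 876ed55d9b: a protocol buffer varint carries 7 payload bits per byte, so a 64-bit value needs at most 10 bytes and the last valid shift is 63. A minimal Python sketch of a decoder with that stop condition follows. It is not the haproxy decoder; the name `decode_varint` and the error handling are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1


def decode_varint(data):
    """Decode one little-endian base-128 varint from a bytes object.

    Returns (value, number_of_bytes_consumed). The shift never exceeds 63,
    which bounds the encoding to 10 bytes for a 64-bit value.
    """
    value = 0
    shift = 0
    for consumed, byte in enumerate(data, start=1):
        if shift > 63:
            # An 11th byte would require shifting a 64-bit value by 70.
            raise ValueError("varint longer than 10 bytes")
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            # Continuation bit clear: this was the last byte.
            return value & MASK64, consumed
        shift += 7
    raise ValueError("truncated varint")
```

The shifts used are 0, 7, ..., 63, one per byte, which is where the limit of 10 bytes comes from.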
Olivier Houchard	4a0e7fe4f7	MINOR: connections: Don't mark conn flags 0x00000001 and 0x00000002 as unused. Remove the comments saying 0x00000001 and 0x00000002 are unused, they are now used by CO_FL_SAFE_LIST and CO_FL_IDLE_LIST.	2020-03-31 23:04:20 +02:00
William Lallemand	fa8cf0c476	MINOR: ssl: store a ptr to crtlist in crtlist_entry Store a pointer to crtlist in crtlist_entry so we can re-insert a crtlist_entry in its crtlist ebpt after updating its key.	2020-03-31 12:32:17 +02:00
William Lallemand	23d61c00b9	MINOR: ssl: add a list of crtlist_entry in ckch_store When updating a ckch_store we may want to update its pointer in the crtlist_entry which use it. To do this, we need the list of the entries using the store.	2020-03-31 12:32:17 +02:00
William Lallemand	493983128b	BUG/MINOR: ssl: ckch_inst wrongly inserted in crtlist_entry The instances were wrongly inserted in the crtlist entries, all instances of a crt-list were inserted in the last crt-list entry. Which was kind of handy to free all instances upon error. Now that it's done correctly, the error path was changed, it must iterate on the entries and find the ckch_insts which were generated for this bind_conf. To avoid wasting time, it stops the iteration once it found the first unsuccessful generation.	2020-03-31 12:32:17 +02:00
William Lallemand	ad3c37b760	REORG: ssl: move SETCERT enum to ssl_sock.h Move the SETCERT enum at the right place to cleanup ssl_sock.c.	2020-03-31 12:32:17 +02:00
William Lallemand	79d31ec0d4	MINOR: ssl: add a list of bind_conf in struct crtlist In order to be able to add new certificate in a crt-list, we need the list of bind_conf that uses this crt-list so we can create a ckch_inst for each of them.	2020-03-31 12:32:17 +02:00
William Lallemand	638f6ad033	MINOR: cli: add a general purpose pointer in the CLI struct This patch adds a p2 generic pointer which is initialized to zero before calling the parser.	2020-03-31 12:32:17 +02:00
Olivier Houchard	cf612a0457	MINOR: servers: Add a counter for the number of currently used connections. Add a counter to know the current number of used connections, as well as the max, this will be used later to refine the algorithm used to kill idle connections, based on current usage.	2020-03-30 00:30:01 +02:00
Jerome Magnin	824186bb08	MEDIUM: stream: support use-server rules with dynamic names With server-template was introduced the possibility to scale the number of servers in a backend without needing a configuration change and associated reload. On the other hand it became impractical to write use-server rules for these servers as they would only accept existing server labels as argument. This patch allows the use of log-format notation to describe targets of use-server rules, such as in the example below: listen test bind *:1234 use-server %[hdr(srv)] if { hdr(srv) -m found } use-server s1 if { path / } server s1 127.0.0.1:18080 server s2 127.0.0.1:18081 If a use-server rule is applied because it was conditioned by an ACL returning true, but the target of the use-server rule cannot be resolved, no other use-server rule is evaluated and we fall back to load balancing. This feature was requested on the ML, and bumped with issue #563.	2020-03-29 09:55:10 +02:00
Olivier Houchard	dbda31939d	BUG/MINOR: connections: Set idle_time before adding to idle list. In srv_add_to_idle_list(), make sure we set the idle_time before we add the connection to an idle list, not after, otherwise another thread may grab it, set the idle_time to 0, only to have the original thread set it back to now_ms. This may have an impact, as in conn_free() we check idle_time to decide if we should decrement the idle connection counters for the server.	2020-03-22 20:05:59 +01:00
Olivier Houchard	ad91124bcf	BUILD/MEDIUM: fd: Declare fd_mig_lock as extern. Declare fd_mig_lock as extern so that it isn't defined multiple times. This should fix build for architectures without double-width CAS.	2020-03-20 11:42:11 +01:00
Olivier Houchard	566df309c6	MEDIUM: connections: Attempt to get idle connections from other threads. In connect_server(), if we no longer have any idle connections for the current thread, attempt to use the new "takeover" mux method to steal a connection from another thread. This should have no impact right now, given no mux implements it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	d2489e00b0	MINOR: connections: Add a flag to know if we're in the safe or idle list. Add flags to connections, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST, to let one know we are in the safe list, or the idle list.	2020-03-19 22:07:33 +01:00
Olivier Houchard	f0d4dff25c	MINOR: connections: Make the "list" element a struct mt_list instead of list. Make the "list" element a struct mt_list, and explicitly use list_from_mt_list to get a struct list * where it is used as such, so that mt_list_for_each_entry will be usable with it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	00bdce24d5	MINOR: connections: Add a new mux method, "takeover". Add a new mux method, "takeover", that will attempt to make the current thread responsible for the connection. It should return 0 on success, and non-zero on failure.	2020-03-19 22:07:33 +01:00
Olivier Houchard	8851664293	MINOR: fd: Implement fd_takeover(). Implement a new function, fd_takeover(), that lets you become the thread responsible for the fd. On architectures that do not have a double-width CAS, use a global rwlock. fd_set_running() was also changed to be able to compete with fd_takeover(), either using a double-width CAS on both running_mask and thread_mask, or by claiming a reader on the global rwlock. This extra operation should not have any measurable impact on modern architectures where threading is relevant.	2020-03-19 22:07:33 +01:00
Olivier Houchard	dc2f2753e9	MEDIUM: servers: Split the connections into idle, safe, and available. Revamp the server connection lists. We now have 3 lists: - idle_conns, which contains idling connections - safe_conns, which contains idling connections that are safe to use even for the first request - available_conns, which contains connections that are not idling, but can still accept new streams (those are HTTP/2 or fastcgi, and are always considered safe).	2020-03-19 22:07:33 +01:00
Olivier Houchard	2444aa5b66	MEDIUM: sessions: Don't be responsible for connections anymore. Make it so sessions are not responsible for connection anymore, except for connections that are private, and thus can't be shared, otherwise, as soon as a request is done, the session will just add the connection to the orphan connections pool. This will break http-reuse safe, but it is expected to be fixed later.	2020-03-19 22:07:33 +01:00
Olivier Houchard	899fb8abdc	MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc. The flush_lock was introduced, mostly to be sure that pool_gc() will never dereference a pointer that has been free'd. __pool_get_first() was acquiring the lock too, the fear was that otherwise that pointer could get free'd later, and then pool_gc() would attempt to dereference it. However, that can not happen, because the only functions that can free a pointer, when using lockless pools, are pool_gc() and pool_flush(), and as long as those two are mutually exclusive, nobody will be able to free the pointer while pool_gc() attempts to access it. So change the flush_lock to a spinlock, and don't bother acquire/release it in __pool_get_first(), that way callers of __pool_get_first() won't have to wait while the pool is flushed. The worst that can happen is we call __pool_refill_alloc() while the pool is getting flushed, and memory can get allocated just to be free'd. This may help with github issue #552. This may be backported to 2.1, 2.0 and 1.9.	2020-03-18 15:55:35 +01:00
Olivier Houchard	de01ea9878	MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h. Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into types/signal.h, so that we can access them in another file. We need those definitions to avoid blocking those signals when running __signal_process_queue(). This should be backported to 2.1, 2.0 and 1.9.	2020-03-18 13:07:19 +01:00
Olivier Houchard	a7bf573520	MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock. In the struct fdtab, introduce a new mask, running_mask. Each thread should add its bit before using the fd. Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just spin as long as the mask is non-zero, to be sure we access the data exclusively. fd_set_running_excl() spins until the mask is 0, fd_set_running() just adds the thread bit, and fd_clr_running() removes it.	2020-03-17 15:30:07 +01:00
William Lallemand	2954c478eb	MEDIUM: ssl: allow crt-list caching The crtlist structure defines a crt-list in the HAProxy configuration. It contains crtlist_entry structures which are the lines in a crt-list file. crt-list are now loaded in memory using crtlist and crtlist_entry structures. The file is read only once. The generation algorithm changed a little bit, new ckch instances are generated from the crtlist structures, instead of being generated during the file loading. The loading function was split in two, one that loads and caches the crt-list and certificates, and one that looks for a crt-list and creates the ckch instances. Filters are also stored in crtlist_entry->filters as a char ** so we can generate the sni_ctx again if needed. It won't be needed anymore to parse the sni_ctx to do that. A crtlist_entry stores the list of all ckch_inst that were generated from this entry.	2020-03-16 16:18:49 +01:00
Willy Tarreau	e4d42551bd	BUILD: pools: silence build warnings with DEBUG_MEMORY_POOLS and DEBUG_UAF With these debug options we still get these warnings: include/common/memory.h:501:23: warning: null pointer dereference [-Wnull-dereference] *(volatile int *)0 = 0; ~~~~~~~~~~~~~~~~~~~^~~ include/common/memory.h:460:22: warning: null pointer dereference [-Wnull-dereference] *(volatile int *)0 = 0; ~~~~~~~~~~~~~~~~~~~^~~ These are purposely there to crash the process at specific locations. But the annoying warnings do not help with debugging and they are not even reliable as the compiler may decide to optimize them away. Let's pass the pointer through DISGUISE() to avoid this.	2020-03-14 11:10:21 +01:00
Willy Tarreau	2e8ab6b560	MINOR: use DISGUISE() everywhere we deliberately want to ignore a result It's more generic and versatile than the previous shut_your_big_mouth_gcc() that was used to silence annoying warnings as it's not limited to ignoring syscalls returns only. This allows us to get rid of the aforementioned function and the shut_your_big_mouth_gcc_int variable, that started to look ugly in multi-threaded environments.	2020-03-14 11:04:49 +01:00
Willy Tarreau	15ed69fd3f	MINOR: debug: consume the write() result in BUG_ON() to silence a warning Tim reported that BUG_ON() issues warnings on his distro, as the libc marks some syscalls with __attribute__((warn_unused_result)). Let's pass the write() result through DISGUISE() to hide it.	2020-03-14 10:58:35 +01:00
Willy Tarreau	f401668306	MINOR: debug: add a new DISGUISE() macro to pass a value as identity This does exactly the same as ALREADY_CHECKED() but does it inline, returning an identical copy of the scalar variable without letting the compiler know how it might have been transformed. This can forcefully disable certain null-pointer checks or result checks when known undesirable. Typically forcing a crash with *(DISGUISE(NULL))=0 will not cause a null-deref warning.	2020-03-14 10:52:46 +01:00
Ilya Shipitsin	77e3b4a2c4	CLEANUP: assorted typo fixes in the code and comments These are mostly comments in the code. A few error messages were fixed and are of low enough importance not to deserve a backport. Some regtests were also fixed.	2020-03-14 09:42:07 +01:00
Tim Duesterhus	cf6e0c8a83	MEDIUM: proxy_protocol: Support sending unique IDs using PPv2 This patch adds the `unique-id` option to `proxy-v2-options`. If this option is set a unique ID will be generated based on the `unique-id-format` while sending the proxy protocol v2 header and stored as the unique id for the first stream of the connection. This feature is meant to be used in `tcp` mode. It works on HTTP mode, but might result in inconsistent unique IDs for the first request on a keep-alive connection, because the unique ID for the first stream is generated earlier than the others. Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made available in TCP mode.	2020-03-13 17:26:43 +01:00
Tim Duesterhus	d1b15b6e9b	MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections This patch reads a proxy protocol v2 provided unique ID and makes it available using the `fc_pp_unique_id` fetch.	2020-03-13 17:25:23 +01:00
Tim Duesterhus	b435f77620	DOC: proxy_protocol: Reserve TLV type 0x05 as PP2_TYPE_UNIQUE_ID This reserves and defines TLV type 0x05.	2020-03-13 17:25:23 +01:00
Olivier Houchard	84fd8a77b7	MINOR: lists: fix indentation. Fix indentation in the recently added list_to_mt_list().	2020-03-11 21:41:13 +01:00
Olivier Houchard	8676514d4e	MINOR: servers: Kill priv_conns. Remove the list of private connections from server, it has been largely unused, we only inserted connections in it, but we would never actually use it.	2020-03-11 19:20:01 +01:00
Olivier Houchard	751e5e21a9	MINOR: lists: Implement function to convert list => mt_list and mt_list => list Implement mt_list_to_list() and list_to_mt_list(), to be able to convert from a struct list to a struct mt_list, and vice versa. This is normally of no use, except for struct connection's list field, that can go in either a struct list or a struct mt_list.	2020-03-11 17:10:40 +01:00
Olivier Houchard	49983a9fe1	MINOR: mt_lists: Appease gcc. gcc is confused, and think p may end up being NULL in _MT_LIST_RELINK_DELETED. It should never happen, so let gcc know that.	2020-03-11 17:10:08 +01:00
Willy Tarreau	638698da37	BUILD: stream-int: fix a few includes dependencies The stream-int code doesn't need to load server.h as it doesn't use servers at all. However removing this one reveals that proxy.h was lacking types/checks.h that used to be silently inherited from types/server.h loaded before in stream_interface.h.	2020-03-11 14:15:33 +01:00
Willy Tarreau	855796bdc8	BUG/MAJOR: list: fix invalid element address calculation Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10 (pre-release). It turns out that constructs such as: while (item != head) { item = LIST_ELEM(item.n); } loop forever, never matching <item> to <head> despite a printf there showing them equal. In practice the problem is that the LIST_ELEM() macro is wrong, it assigns the subtract of two pointers (an integer) to another pointer through a cast to its pointer type. And GCC 10 now considers that this cannot match a pointer and silently optimizes the comparison away. A tested workaround for this is to build with -fno-tree-pta. Note that older gcc versions even with -ftree-pta do not exhibit this rather surprising behavior. This patch changes the test to instead cast the null-based address to an int to get the offset and subtract it from the pointer, and this time it works. There were just a few places to adjust. Ideally offsetof() should be used but the LIST_ELEM() API doesn't make this trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*) thus it would require to completely change the whole API, which is not something workable in the short term, especially for a backport. With this change, the emitted code is subtly different even on older versions. A code size reduction of ~600 bytes and a total executable size reduction of ~1kB are expected to be observed and should not be taken as an anomaly.
Typically this loop in dequeue_proxy_listeners() : while ((listener = MT_LIST_POP(...))) used to produce this code where the comparison is performed on RAX while the new offset is assigned to RDI even though both are always identical: 53ded8: 48 8d 78 c0 lea -0x40(%rax),%rdi 53dedc: 48 83 f8 40 cmp $0x40,%rax 53dee0: 74 39 je 53df1b <dequeue_proxy_listeners+0xab> and now produces this one which is slightly more efficient as the same register is used for both purposes: 53dd08: 48 83 ef 40 sub $0x40,%rdi 53dd0c: 74 2d je 53dd3b <dequeue_proxy_listeners+0x9b> Similarly, retrieving the channel from a stream_interface using si_ic() and si_oc() used to cause this (stream-int in rdi): 1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi) 1cbe: f6 47 04 10 testb $0x10,0x4(%rdi) 1cc2: 74 1c je 1ce0 <si_report_error+0x30> 1cc4: 48 81 ef 00 03 00 00 sub $0x300,%rdi 1ccb: 81 4f 10 00 08 00 00 orl $0x800,0x10(%rdi) and now causes this: 1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi) 1cbe: f6 47 04 10 testb $0x10,0x4(%rdi) 1cc2: 74 1c je 1ce0 <si_report_error+0x30> 1cc4: 81 8f 10 fd ff ff 00 orl $0x800,-0x2f0(%rdi) There is extremely little chance that this fix wakes up a dormant bug as the emitted code effectively does what the source code intends. This must be backported to all supported branches (dropping MT_LIST_ELEM and the spoa_example parts as needed), since the bug is subtle and may not always be visible even when compiling with gcc-10.	2020-03-11 14:12:51 +01:00
Olivier Houchard	1d117e3dcd	BUG/MEDIUM: mt_lists: Make sure we set the deleted element to NULL; In MT_LIST_DEL_SAFE(), when the code was changed to use a temporary variable instead of using the provided pointer directly, we shouldn't have changed the code that set the pointer to NULL, as we really want the pointer provided to be nullified, otherwise other parts of the code won't know we just deleted an element, and bad things will happen. This should be backported to 2.1.	2020-03-10 17:45:05 +01:00
Willy Tarreau	9a0dfa5298	CLEANUP: remove the now unused common/syscall.h It was added 9 years ago to implement USE_MY_SPLICE on some libcs where syscall() was bogus. It's about time to get rid of this.	2020-03-10 07:28:46 +01:00
Willy Tarreau	06c63aec95	CLEANUP: remove support for USE_MY_SPLICE The splice() syscall has been supported in glibc since version 2.5 issued in 2006 and is present on supported systems so there's no need for having our own arch-specific syscall definitions anymore.	2020-03-10 07:23:41 +01:00
Willy Tarreau	3858b122a6	CLEANUP: remove support for USE_MY_EPOLL This was made to support epoll on patched 2.4 kernels, and on early 2.6 using alternative libcs thanks to the arch-specific syscall definitions. All the features we support have been around since 2.6.2 and present in glibc since 2.3.2, neither of which are found in field anymore. Let's simply drop this and use epoll normally.	2020-03-10 07:08:10 +01:00
Willy Tarreau	618ac6ea52	CLEANUP: drop support for USE_MY_ACCEPT4 The accept4() syscall has been present for a while now, there is no more reason for maintaining our own arch-specific syscall implementation for systems lacking it in libc but having it in the kernel.	2020-03-10 07:02:46 +01:00
Willy Tarreau	c3e926bf3b	CLEANUP: remove support for Linux i686 vsyscalls This was introduced 10 years ago to squeeze a few CPU cycles per syscall on 32-bit x86 machines and was already quite old by then, requiring to explicitly enable support for this in the kernel. We don't even know if it still builds, let alone if it works at all on recent kernels! Let's completely drop this now.	2020-03-10 06:55:52 +01:00
William Lallemand	6763016866	BUG/MINOR: ssl/cli: sni_ctx' mustn't always be used as filters Since commit 244b070 ("MINOR: ssl/cli: support crt-list filters"), HAProxy generates a list of filters based on the sni_ctx in memory. However it's not always relevant, sometimes no filters were configured and the CN/SAN in the new certificate are not the same. This patch fixes the issue by using a flag filters in the ckch_inst, so we are able to know if there were filters or not. In the latter case it uses the CN/SAN of the new certificate to generate the sni_ctx. note: filters are still only used in the crt-list atm.	2020-03-09 17:32:04 +01:00
William Lallemand	0a52846603	CLEANUP: ssl: is_default is a bit in ckch_inst The field is_default becomes a bit in the ckch_inst structure.	2020-03-09 17:32:04 +01:00
Miroslav Zagorac	d7dc67ba1d	CLEANUP: remove unused code in 'my_ffsl/my_flsl' functions Shifting the variable 'a' one bit to the right has no effect on the result of the functions.	2020-03-09 14:47:27 +01:00
Willy Tarreau	ee3bcddef7	MINOR: tools: add a generic function to generate UUIDs We currently have two UUID generation functions, one for the sample fetch and the other one in the SPOE filter. Both were a bit complicated since they were made to support random() implementations returning an arbitrary number of bits, and were throwing away 33 bits every 64. Now we don't need this anymore, so let's have a generic function consuming 64 bits at once and use it as appropriate.	2020-03-08 18:04:16 +01:00
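Note on ee3bcddef7: a version 4 UUID has 122 random bits, so two 64-bit random words are enough to fill one. The Python sketch below shows one possible way to lay those bits out. The function name `uuid4_from_randoms` and the exact bit assignment are assumptions made for the example and are not claimed to match the layout used by the haproxy function.

```python
MASK64 = (1 << 64) - 1


def uuid4_from_randoms(r0, r1):
    """Format two 64-bit random words as a version 4 (random) UUID string."""
    r0 &= MASK64
    r1 &= MASK64
    time_low = r0 & 0xFFFFFFFF                   # 32 random bits
    time_mid = (r0 >> 32) & 0xFFFF               # 16 random bits
    time_hi = ((r0 >> 48) & 0x0FFF) | 0x4000     # 12 random bits + version 4
    clock_seq = (r1 & 0x3FFF) | 0x8000           # 14 random bits + variant 10
    node = (r1 >> 14) & 0xFFFFFFFFFFFF           # 48 random bits
    return "%08x-%04x-%04x-%04x-%012x" % (
        time_low, time_mid, time_hi, clock_seq, node)
```

The two fixed fields are the version nibble (4) and the two variant bits (binary 10); every other bit comes straight from the random words.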
Willy Tarreau	52bf839394	BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG This is the replacement of the failed attempt to add thread safety and per-process sequences of random numbers initially tried with commit `1c306aa84d` ("BUG/MEDIUM: random: implement per-thread and per-process random sequences"). This new version takes a completely different approach and doesn't try to work around the horrible OS-specific and non-portable random API anymore. Instead it implements "xoroshiro128**", a reputedly high quality random number generator, which is one of the many variants of xorshift, which passes all quality tests and which is described here: http://prng.di.unimi.it/ While not cryptographically secure, it is fast and features a 2^128-1 period. It supports fast jumps allowing to cut the period into smaller non-overlapping sequences, which we use here to support up to 2^32 processes each having their own, non-overlapping sequence of 2^96 numbers (~7*10^28). This is enough to provide 1 billion randoms per second and per process for 2200 billion years. The implementation was made thread-safe either by using a double 64-bit CAS on platforms supporting it (x86_64, aarch64) or by using a local lock for the time needed to perform the shift operations. This ensures that all threads pick numbers from the same pool so that it is not needed to assign per-thread ranges. For processes we use the fast jump method to advance the sequence by 2^96 for each process.
Before this patch, the following config: global nbproc 8 frontend f bind :4445 mode http log stdout format raw daemon log-format "%[uuid] %pid" redirect location / Would produce this output: a4d0ad64-2645-4b74-b894-48acce0669af 12987 a4d0ad64-2645-4b74-b894-48acce0669af 12992 a4d0ad64-2645-4b74-b894-48acce0669af 12986 a4d0ad64-2645-4b74-b894-48acce0669af 12988 a4d0ad64-2645-4b74-b894-48acce0669af 12991 a4d0ad64-2645-4b74-b894-48acce0669af 12989 a4d0ad64-2645-4b74-b894-48acce0669af 12990 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986 (...) And now produces: f94b29b3-da74-4e03-a0c5-a532c635bad9 13011 47470c02-4862-4c33-80e7-a952899570e5 13014 86332123-539a-47bf-853f-8c8ea8b2a2b5 13013 8f9efa99-3143-47b2-83cf-d618c8dea711 13012 3cc0f5c7-d790-496b-8d39-bec77647af5b 13015 3ec64915-8f95-4374-9e66-e777dc8791e0 13009 0f9bf894-dcde-408c-b094-6e0bb3255452 13011 49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014 e23f6f2e-35c5-4433-a294-b790ab902653 13012 There are multiple benefits to using this method. First, it doesn't depend anymore on a non-portable API. Second it's thread safe. Third it is fast and more proven than any hack we could attempt to try to work around the deficiencies of the various implementations around. This commit depends on previous patches "MINOR: tools: add 64-bit rotate operators" and "BUG/MEDIUM: random: initialize the random pool a bit better", all of which will need to be backported at least as far as version 2.0. It doesn't require to backport the build fixes for circular include files dependency anymore.	2020-03-08 10:09:02 +01:00
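Note on 52bf839394: the generator step of the published xoroshiro128** algorithm (rotation constants 24, 16 and 37) can be sketched in a few lines. This is a single-threaded Python illustration only: it leaves out the double-width CAS or lock that makes the haproxy version thread-safe, as well as the jump function used to give each process its own sub-sequence. The class name, method name and seeding interface are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1


def rotl64(value, bits):
    """Rotate a 64-bit word left by `bits` positions (0 <= bits < 64)."""
    value &= MASK64
    return ((value << bits) | (value >> (64 - bits))) & MASK64


class Xoroshiro128StarStar:
    """Single-threaded sketch of the xoroshiro128** state update."""

    def __init__(self, s0, s1):
        if (s0 | s1) & MASK64 == 0:
            # The all-zero state is a fixed point and must be avoided.
            raise ValueError("state must not be all zeroes")
        self.s0 = s0 & MASK64
        self.s1 = s1 & MASK64

    def next64(self):
        s0, s1 = self.s0, self.s1
        # Output scrambler: multiply, rotate, multiply.
        result = (rotl64((s0 * 5) & MASK64, 7) * 9) & MASK64
        # Linear state transition.
        s1 ^= s0
        self.s0 = rotl64(s0, 24) ^ s1 ^ ((s1 << 16) & MASK64)
        self.s1 = rotl64(s1, 37)
        return result
```

Starting from the state (1, 2), the first call to `next64()` returns 5760, since rotl64(1 * 5, 7) is 640 and 640 * 9 is 5760. Because the whole state is 128 bits wide, an atomic update needs either a double-width CAS or a lock, which is the trade-off the commit message describes.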
Willy Tarreau	7a40909c00	MINOR: tools: add 64-bit rotate operators This adds rotl64/rotr64 to rotate a 64-bit word by an arbitrary number of bits. It's mainly aimed at being used with constants.	2020-03-08 00:42:18 +01:00
Willy Tarreau	0fbf28a05b	Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences" This reverts commit `1c306aa84d`. It breaks the build on all non-glibc platforms. I got confused by the man page (which possibly is the most confusing man page I've ever read about a standard libc function) and mistakenly understood that random_r was portable, especially since it appears in latest freebsd source as well but not in released versions, and with a slightly different API :-/ We need to find a different solution with a fallback. Among the possibilities, we may reintroduce this one with a fallback relying on locking around the standard functions, keeping fingers crossed for no other library function to call them in parallel, or we may also provide our own PRNG, which is not necessarily more difficult than working around the totally broken up design of the portable API.	2020-03-07 11:24:39 +01:00
Willy Tarreau	1c306aa84d	BUG/MEDIUM: random: implement per-thread and per-process random sequences As mentioned in previous patch, the random number generator was never made thread-safe, which used not to be a problem for health checks spreading, until the uuid sample fetch function appeared. Currently it is possible for two threads or processes to produce exactly the same UUID. In fact it's extremely likely that this will happen for processes, as can be seen with this config: global nbproc 8 frontend f bind :4445 mode http log stdout daemon format raw log-format "%[uuid] %pid" redirect location / It typically produces this log: 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646 07764439-c24d-4e6f-a5a6-0138be59e7a8 30645 07764439-c24d-4e6f-a5a6-0138be59e7a8 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643 07764439-c24d-4e6f-a5a6-0138be59e7a8 30646 b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642 07764439-c24d-4e6f-a5a6-0138be59e7a8 30642 What this patch does is to use a distinct per-thread and per-process seed to make sure the same sequences will not appear, and will then extend these seeds by "burning" a number of randoms that depends on the global random seed, the thread ID and the process ID. This adds roughly 20 extra bits of randomness, resulting in 52 bits total per thread and per process. It only takes a few milliseconds to burn these randoms and given that threads start with a different seed, we know they will not catch each other. So these random extra bits are essentially added to ensure randomness between boots and cluster instances. This replaces all uses of random() with ha_random() which uses the thread-local state. This must be backported as far as 2.0 or any version having the UUID sample-fetch function since it's the main victim here. 
It's important to note that this patch, in addition to depending on the previous one "BUG/MEDIUM: init: initialize the random pool a bit better", also depends on the preceding build fixes to address a circular dependency issue in the include files that prevented it from building. Part or all of these patches may need to be backported or adapted as well.	2020-03-07 06:11:15 +01:00
Willy Tarreau	6c3a681bd6	BUG/MEDIUM: random: initialize the random pool a bit better Since the UUID sample fetch was created, some people noticed that in certain virtualized environments they manage to get exact same UUIDs on different instances started exactly at the same moment. It turns out that the randoms were only initialized to spread the health checks originally, not to provide "clean" randoms. This patch changes this and collects more randomness from various sources, including existing randoms, /dev/urandom when available, RAND_bytes() when OpenSSL is available, as well as the timing for such operations, then applies a SHA1 on all this to keep a 160 bits random seed available, 32 of which are passed to srandom(). It's worth mentioning that there's no clean way to pass more than 32 bits to srandom() as even initstate() provides an opaque state that must absolutely not be tampered with since known implementations contain state information. At least this allows to have up to 4 billion different sequences from the boot, which is not that bad. Note that the thread safety was still not addressed, which is another issue for another patch. This must be backported to all versions containing the UUID sample fetch function, i.e. as far as 2.0.	2020-03-07 06:11:11 +01:00
Willy Tarreau	5a421a8f49	BUILD: listener: types/listener.h must not include standard.h It's only a type definition, this header is not needed and causes some circular dependency issues.	2020-03-07 06:07:18 +01:00
Willy Tarreau	c7f64e7a58	BUILD: freq_ctr: proto/freq_ctr needs to include common/standard.h This is needed for div_64_32() which is there and currently accidentally inherited via global.h!	2020-03-07 06:07:18 +01:00
Willy Tarreau	f23e029409	BUILD: global: must not include common/standard.h but only types/freq_ctr.h This one was accidentally inherited and used to work but causes a circular dependency.	2020-03-07 06:07:18 +01:00
Willy Tarreau	8dd0d55efe	BUILD: ssl: include mini-clist.h We use some list definitions and we don't include this header which is in fact accidentally inherited from others, causing a circular dependency issue.	2020-03-07 06:07:18 +01:00
Willy Tarreau	a8561db936	BUILD: buffer: types/{ring.h,checks.h} should include buf.h, not buffer.h buffer.h relies on proto/activity because it contains some code and not just type definitions. It must not be included from types files. It should probably also be split in two if it starts to include a proto. This causes some circular dependencies at other places.	2020-03-07 06:07:18 +01:00
Christopher Faulet	d8f0e073dd	MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY This flag was used in some internal functions to be sure the current stream is able to handle HTTP content. It was introduced when the legacy HTTP code was still there. Now, It is possible to rely on stream's flags to be sure we have an HTX stream. So the flag HLUA_TXN_HTTP_RDY can be removed. Everywhere it was tested, it is replaced by a call to the IS_HTX_STRM() macro. This patch is mandatory to allow the support of the filters written in lua.	2020-03-06 14:13:00 +01:00
Christopher Faulet	1cdceb9365	MINOR: htx: Add a function to return a block at a specific offset The htx_find_offset() function may be used to look for a block at a specific offset in an HTX message, starting from the message head. A compound result is returned, an htx_ret structure, with the found block and the position of the offset in the block. If the offset is outside of the HTX message, the returned block is NULL.	2020-03-06 14:12:59 +01:00
Christopher Faulet	251f4917c3	MINOR: buf: Add function to insert a string at an absolute offset in a buffer The b_insert_blk() function may now be used to insert a string, given a pointer and the string length, at an absolute offset in a buffer, moving data between this offset and the buffer's tail just after the end of the inserted string. The buffer's length is automatically updated. This function supports wrapping. Either the whole string is copied or nothing is. So it returns 0 if there is not enough space to perform the copy. Otherwise, the number of bytes copied is returned.	2020-03-06 14:12:59 +01:00
Carl Henrik Lunde	f91ac19299	OPTIM: startup: fast unique_id allocation for acl. pattern_finalize_config() uses an inefficient algorithm which is a problem with very large configuration files. This affects startup, and therefore reload time. When haproxy is deployed as a router in a Kubernetes cluster the generated configuration file may be large and reloads occur frequently, which makes this a significant issue. The old algorithm is O(n^2): * allocate missing uids - O(n^2) * sort linked list - O(n^2) The new algorithm is O(n log n): * find the user allocated uids - O(n) * store them for efficient lookup - O(n log n) * allocate missing uids - n times O(log n) * sort all uids - O(n log n) * convert back to linked list - O(n) Performance examples, startup time in seconds (pat_refs: old, new): 1000: 0.02, 0.01; 10000: 2.1, 0.04; 20000: 12.3, 0.07; 30000: 27.9, 0.10; 40000: 52.5, 0.14; 50000: 77.5, 0.17. Please backport to 1.8, 2.0 and 2.1.	2020-03-06 08:11:58 +01:00
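The steps of the new algorithm listed in the commit above can be sketched on a plain array. This is only an illustration under assumptions, not the code of pattern_finalize_config(): the function name `assign_unique_ids`, the use of `-1` to mean "no uid yet", and the array representation (instead of a linked list of pattern references) are all hypothetical, and duplicate user-supplied uids are not handled.

```c
#include <stdlib.h>

/* Comparator shared by qsort() and bsearch(). */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Hypothetical helper: entries equal to -1 have no uid yet. They receive the
 * smallest non-negative values not already taken, then the whole array is
 * sorted. Returns 0 on success, -1 if memory could not be allocated.
 */
int assign_unique_ids(int *uids, size_t n)
{
    int *used;
    size_t nused = 0;
    size_t i;
    int next = 0;

    if (n == 0)
        return 0;

    used = malloc(n * sizeof(*used));
    if (!used)
        return -1;

    /* 1. find the user allocated uids: O(n) */
    for (i = 0; i < n; i++)
        if (uids[i] >= 0)
            used[nused++] = uids[i];

    /* 2. store them for efficient lookup: O(n log n) */
    qsort(used, nused, sizeof(*used), cmp_int);

    /* 3. allocate missing uids: each lookup is O(log n), and <next> only
     *    ever moves forward, so the whole pass stays within O(n log n).
     */
    for (i = 0; i < n; i++) {
        if (uids[i] >= 0)
            continue;
        while (bsearch(&next, used, nused, sizeof(*used), cmp_int))
            next++;
        uids[i] = next++;
    }

    /* 4. sort all uids: O(n log n) */
    qsort(uids, n, sizeof(*uids), cmp_int);

    free(used);
    return 0;
}
```

Each pass is either linear or a sort, and no pass rescans the whole set for every element, which is how the quadratic cost is avoided.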
Tim Duesterhus	a17e66289c	MEDIUM: stream: Make the `unique_id` member of `struct stream` a `struct ist` The `unique_id` member of `struct stream` now is a `struct ist`.	2020-03-05 20:21:58 +01:00
Tim Duesterhus	0643b0e7e6	MINOR: proxy: Make `header_unique_id` a `struct ist` The `header_unique_id` member of `struct proxy` now is a `struct ist`.	2020-03-05 19:58:22 +01:00
Tim Duesterhus	9576ab7640	MINOR: ist: Add `struct ist istdup(const struct ist)` istdup() performs the equivalent of strdup() on a `struct ist`.	2020-03-05 19:53:12 +01:00
Tim Duesterhus	35005d01d2	MINOR: ist: Add `struct ist istalloc(size_t)` and `void istfree(struct ist*)` `istalloc` allocates memory and returns an `ist` with the size `0` that points to this allocation. `istfree` frees the pointed memory and clears the pointer.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	e296d3e5f0	MINOR: ist: Add `int isttest(const struct ist)` `isttest` returns whether the `.ptr` is non-null.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	241e29ef9c	MINOR: ist: Add `IST_NULL` macro `IST_NULL` is equivalent to an `struct ist` with `.ptr = NULL` and `.len = 0`.	2020-03-05 19:52:07 +01:00
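The four `ist` commits above describe small helpers whose behaviour can be sketched in a few lines. The definitions below are a standalone approximation written from the commit descriptions only. The `struct ist` layout (a `.ptr` and a `.len`) is taken from the text, but the function bodies are assumptions and may differ from HAProxy's ist.h.

```c
#include <stdlib.h>
#include <string.h>

/* Indirect string: a pointer plus a length, not necessarily NUL-terminated. */
struct ist {
    char *ptr;
    size_t len;
};

/* An ist with .ptr = NULL and .len = 0, as described for IST_NULL. */
#define IST_NULL ((const struct ist){ .ptr = NULL, .len = 0 })

/* Returns non-zero if .ptr is non-null. */
static inline int isttest(const struct ist s)
{
    return s.ptr != NULL;
}

/* Allocates <size> bytes and returns an ist of length 0 pointing to them.
 * On allocation failure the returned ist has a NULL .ptr.
 */
static inline struct ist istalloc(size_t size)
{
    struct ist r = { .ptr = malloc(size), .len = 0 };

    return r;
}

/* Frees the pointed memory and clears the pointer. */
static inline void istfree(struct ist *s)
{
    free(s->ptr);
    *s = IST_NULL;
}

/* Rough equivalent of strdup() for an ist (sketch only). */
static inline struct ist istdup(const struct ist src)
{
    struct ist dst;

    if (!isttest(src))
        return IST_NULL;

    dst = istalloc(src.len);
    if (isttest(dst)) {
        memcpy(dst.ptr, src.ptr, src.len);
        dst.len = src.len;
    }
    return dst;
}
```

A caller would check the result with `isttest()` before using it and release it with `istfree()`, which leaves the variable equal to `IST_NULL`.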
William Lallemand	cfca1422c7	MINOR: ssl: reach a ckch_store from a sni_ctx It was only possible to go down from the ckch_store to the sni_ctx but not to go up from the sni_ctx to the ckch_store. To allow that, 2 pointers were added: - a ckch_inst pointer in the struct sni_ctx - a ckch_store pointer in the struct ckch_inst	2020-03-05 11:28:42 +01:00
William Lallemand	38df1c8006	MINOR: ssl/cli: support crt-list filters Generate a list of the previous filters when updating a certificate which uses filters in crt-list. Then pass this list to the function generating the sni_ctx during the commit. This feature allows the update of the crt-list certificates which use the filters with "set ssl cert". This function could probably be replaced by creating a new ckch_inst_new_load_store() function which takes the previous sni_ctx list as an argument instead of the char **sni_filter, avoiding the allocation/copy during runtime for each filter. But since we are still handling the multi-cert bundles, it's better this way to avoid code duplication.	2020-03-05 11:27:53 +01:00
Tim Duesterhus	127a74dd48	MINOR: stream: Add stream_generate_unique_id function Currently unique IDs for a stream are generated using repetitive code in multiple locations, possibly allowing for inconsistent behavior.	2020-03-05 07:23:00 +01:00
Willy Tarreau	899e5f69a1	MINOR: debug: use our own backtrace function on clang+x86_64 A test on FreeBSD with clang 4 to 8 produces this on a call to a spinning loop on the CLI: call trace(5): | 0x53e2bc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c | 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e with our own function it correctly produces this: call trace(20): | 0x53e2dc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c | 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e | 0x800e022bf [48 83 c4 38 5b 41 5c 41]: libthr:pthread_getspecific+0xdef | 0x7ffffffff003 [48 8d 7c 24 10 6a 00 48]: main+0x7fffffb416f3 | 0x801373809 [85 c0 0f 84 6f ff ff ff]: libc:__sys_gettimeofday+0x199 | 0x801373709 [89 c3 85 c0 75 a6 48 8b]: libc:__sys_gettimeofday+0x99 | 0x801371c62 [83 f8 4e 75 0f 48 89 df]: libc:gettimeofday+0x12 | 0x51fa0a [48 89 df 4c 89 f6 e8 6b]: ha_thread_dump_all_to_trash+0x49a | 0x4b723b [85 c0 75 09 49 8b 04 24]: mworker_cli_sockpair_new+0xd9b | 0x4b6c68 [85 c0 75 08 4c 89 ef e8]: mworker_cli_sockpair_new+0x7c8 | 0x532f81 [4c 89 e7 48 83 ef 80 41]: task_run_applet+0xe1 So let's add clang+x86_64 to the list of platforms that will use our simplified version. As a bonus it will not require to link with -lexecinfo on FreeBSD and will work out of the box when passing USE_BACKTRACE=1.	2020-03-04 12:04:07 +01:00
Willy Tarreau	13faf16e1e	MINOR: debug: improve backtrace() on aarch64 and possibly other systems It happens that on aarch64 backtrace() only returns one entry (tested with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding the stack due to the risk of hitting a bad pointer. Here we can use may_access() to know when it's safe, so we can actually unwind the stack without taking risks. It happens that the faulting function (the one just after the signal handler) is not listed here, very likely because the signal handler uses a special stack and did not create a new frame. So this patch creates a new my_backtrace() function in standard.h that either calls backtrace() or does its own unrolling. The choice depends on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build target.	2020-03-04 12:04:07 +01:00
Emmanuel Hocdet	842e94ee06	MINOR: ssl: add "ca-verify-file" directive It's only available on bind lines. "ca-verify-file" allows CA certificates to be separated from "ca-file". The CA names sent in the server hello message are only computed from "ca-file". Typically, "ca-file" must be defined with intermediate certificates and "ca-verify-file" with the certificates ending the chain, like the root CA. Fix issue #404.	2020-03-04 11:53:11 +01:00
Willy Tarreau	eb8b1ca3eb	MINOR: tools: add resolve_sym_name() to resolve function pointers We use various hacks at a few places to try to identify known function pointers in debugging outputs (show threads & show fd). Let's centralize this into a new function dedicated to this. It already knows about the functions matched by "show threads" and "show fd", and when built with USE_DL, it can rely on dladdr1() to resolve other functions. There are some limitations, as static functions are not resolved, linking with -rdynamic is mandatory, and even then some functions will not necessarily appear. It's possible to do a better job by rebuilding the whole symbol table from the ELF headers in memory but it's less portable and the gains are still limited, so this solution remains a reasonable tradeoff.	2020-03-03 18:18:40 +01:00
Willy Tarreau	762fb3ec8e	MINOR: tools: add new function dump_addr_and_bytes() This function dumps <n> bytes from <addr> in hex form into buffer <buf> enclosed in brackets after the address itself, formatted on 14 chars including the "0x" prefix. This is meant to be used as a prefix for code areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: " It relies on may_access() to know if the bytes are dumpable, otherwise "--" is emitted. An optional prefix is supported.	2020-03-03 17:46:37 +01:00
Willy Tarreau	27d00c0167	MINOR: task: export run_tasks_from_list This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	3ebd55ee51	MINOR: haproxy: export run_poll_loop This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	1827845a3d	MINOR: haproxy: export main to ease access from debugger Better just export main instead of declaring it as extern, it's cleaner and may be usable elsewhere.	2020-03-03 15:26:10 +01:00
Willy Tarreau	1ed3781e21	MINOR: fd: merge the read and write error bits into RW error We always set them both, which makes sense since errors at the FD level indicate a terminal condition for the socket that cannot be recovered. Usually this is detected via a write error, but sometimes such an error may asynchronously be reported on the read side. Let's simplify this using only the write bit and calling it RW since it's used like this everywhere, and leave the R bit spare for future use.	2020-02-28 07:42:29 +01:00
Willy Tarreau	a135ea63a6	CLEANUP: fd: remove some unneeded definitions of FD_EV_* flags There's no point in trying to be too generic for these flags as the read and write sides will soon differ a bit. Better explicitly define the flags for each direction without trying to be direction-agnostic. This clarifies the code and removes some defines.	2020-02-28 07:42:29 +01:00
Willy Tarreau	f80fe832b1	CLEANUP: fd: remove the FD_EV_STATUS aggregate This was used only by fd_recv_state() and fd_send_state(), both of which are unused. This will not work anymore once recv and send flags start to differ, so let's remove this.	2020-02-28 07:42:29 +01:00
Jerome Magnin	967d3cc105	BUG/MINOR: http_ana: make sure redirect flags don't have overlapping bits commit `c87e46881` ("MINOR: http-rules: Add a flag on redirect rules to know the rule direction") introduced a new flag for redirect rules, but its value has bits in common with REDIRECT_FLAG_DROP_QS, which makes us enter this code path in http_apply_redirect_rule(), which will then drop the query string. To fix this, just give REDIRECT_FLAG_FROM_REQ its own unique value. This must be backported where `c87e468816` is backported. This should fix issue 521.	2020-02-27 23:44:41 +01:00
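The overlap described in the fix above is easy to reproduce in isolation. In the sketch below, the two flag names come from the commit message, but the numeric values, the `BAD_FROM_REQ` constant and the `drops_query_string()` helper are hypothetical and only serve to show why a flag needs its own bit.

```c
/* Illustrative values only: each flag owns a distinct bit. */
#define REDIRECT_FLAG_DROP_QS   0x01u
#define REDIRECT_FLAG_FROM_REQ  0x04u

/* Hypothetical faulty value sharing bit 0 with REDIRECT_FLAG_DROP_QS. */
#define BAD_FROM_REQ            0x03u

/* Compile-time guard: the build fails if the two flags ever share a bit. */
_Static_assert((REDIRECT_FLAG_DROP_QS & REDIRECT_FLAG_FROM_REQ) == 0,
               "redirect flags must not have overlapping bits");

/* Hypothetical stand-in for the test that decides whether the query string
 * is dropped when a redirect rule is applied.
 */
static int drops_query_string(unsigned int flags)
{
    return (flags & REDIRECT_FLAG_DROP_QS) != 0;
}
```

With the overlapping value, `drops_query_string(BAD_FROM_REQ)` is true even though nobody asked to drop the query string, which matches the symptom reported in the commit. With a dedicated bit, `drops_query_string(REDIRECT_FLAG_FROM_REQ)` is false.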
Willy Tarreau	2104659cd5	MEDIUM: buffer: remove the buffer_wq lock This lock was only needed to protect the buffer_wq list, but now we have the mt_list for this. This patch simply turns the buffer_wq list to an mt_list and gets rid of the lock. It's worth noting that the whole buffer_wait thing still looks totally wrong especially in a threaded context: the wakeup_cb() callback is called synchronously from any thread and may end up calling some connection code that was not expected to run on a given thread. The whole thing should probably be reworked to use tasklets instead and be a bit more centralized.	2020-02-26 10:39:36 +01:00
William Lallemand	e0f3fd5b4c	CLEANUP: ssl: move issuer_chain tree and definition Move the cert_issuer_tree outside the global_ssl structure since it's not a configuration variable. And move the declaration of the issuer_chain structure in types/ssl_sock.h	2020-02-25 15:06:40 +01:00
Willy Tarreau	226ef26056	MINOR: compiler: add new alignment macros This commit adds ALWAYS_ALIGN(), MAYBE_ALIGN() and ATOMIC_ALIGN() to be placed as delimiters inside structures to force alignment to a given size. These depend on the architecture's capabilities so that it is possible to always align, align only on archs not supporting unaligned accesses at all, or only on those not supporting them for atomic accesses (e.g. before a lock).	2020-02-25 10:34:43 +01:00
Willy Tarreau	908071171b	BUILD: general: always pass unsigned chars to is* functions The isalnum(), isalpha(), isdigit() etc functions from ctype.h are supposed to take an int in argument which must either reflect an unsigned char or EOF. In practice on some platforms they're implemented as macros referencing an array, and when passed a char, they either cause a warning "array subscript has type 'char'" when lucky, or cause random segfaults when unlucky. It's quite inconvenient by the way since none of them may return true for negative values. The recent introduction of cygwin to the list of regularly tested build platforms revealed a lot of breakage there due to the same issues again. So this patch addresses the problem all over the code at once. It adds unsigned char casts to every valid use case, and also drops the unneeded double cast to int that was sometimes added on top of it. It may be backported by dropping irrelevant changes if that helps better support uncommon platforms. It's unlikely to fix bugs on platforms which would already not emit any warning though.	2020-02-25 08:16:33 +01:00
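The cast described in the commit above looks like this in practice. The helper name `count_alnum` is made up for the illustration; only the `(unsigned char)` cast in front of the `isalnum()` argument is the point.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the alphanumeric bytes of a NUL-terminated string. Where plain
 * char is signed, bytes above 0x7f are negative, and passing them to
 * isalnum() unchanged is undefined behaviour. Casting to unsigned char
 * keeps the argument within the range the is* functions accept.
 */
size_t count_alnum(const char *s)
{
    size_t n = 0;

    for (; *s; s++) {
        if (isalnum((unsigned char)*s))
            n++;
    }
    return n;
}
```

The cast costs nothing at run time and silences the "array subscript has type 'char'" warning on platforms where the is* functions are array-indexing macros.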
Willy Tarreau	03e7853581	BUILD: remove obsolete support for -mregparm / USE_REGPARM This used to be a minor optimization on ix86 where registers are scarce and the calling convention not very efficient, but this platform is not relevant enough anymore to warrant all this dirt in the code for the sake of saving 1 or 2% of performance. Modern platforms don't use this at all since their calling convention already defaults to using several registers so better get rid of this once for all.	2020-02-25 07:41:47 +01:00
Tim Duesterhus	1d48ba91d7	CLEANUP: net_helper: Do not negate the result of unlikely This patch turns the double negation of 'not unlikely' into 'likely' and then turns the negation of 'not smaller' into 'greater or equal' in an attempt to improve readability of the condition. [wt: this was not a bug but purposely written like this to improve code generation on older compilers but not needed anymore as described here: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]	2020-02-25 07:30:49 +01:00
Tim Duesterhus	927063b892	CLEANUP: conn: Do not pass a pointer to likely Move the `!` inside the likely and negate it to unlikely. The previous version should not have caused issues, because it is converted to a boolean / integral value before being passed to __builtin_expect(), but it's certainly unusual. [wt: this was not a bug but purposely written like this to improve code generation on older compilers but not needed anymore as described here: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]	2020-02-25 07:30:49 +01:00
Willy Tarreau	89ee79845c	MINOR: compiler: drop special cases of likely/unlikely for older compilers We used to special-case the likely()/unlikely() macros for a series of early gcc 4.x compilers which used to produce very bad code when using __builtin_expect(x,1), which basically used to build an integer (0 or 1) from a condition then compare it to integer 1. This was already fixed in 5.x, but even now, looking at the code produced by various flavors of 4.x this bad behavior couldn't be witnessed anymore. So let's consider it as fixed by now, which will allow to get rid of some ugly tricks at some specific places. A test on 4.7.4 shows that the code shrinks by about 3kB now, thanks to some tests being inlined closer to the call place and the unlikely case being moved to real functions. See the link below for more background on this. Link: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html	2020-02-25 07:29:55 +01:00
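For readers unfamiliar with these macros, a common way to define `likely()` and `unlikely()` on top of `__builtin_expect()` is shown below. This is a generic formulation for GCC and clang, not necessarily the exact text of HAProxy's compiler.h, and `sum_or_fail()` is a made-up function used only to show a call site.

```c
#include <stddef.h>

/* The !! normalizes any scalar (including a pointer) to 0 or 1 before it
 * reaches __builtin_expect(), which expects an integer expression.
 */
#define likely(x)   (__builtin_expect(!!(x), 1))
#define unlikely(x) (__builtin_expect(!!(x), 0))

/* Hypothetical example: the NULL check is marked as the rare path so the
 * compiler can lay out the common path first.
 */
long sum_or_fail(const int *a, size_t n)
{
    long total = 0;
    size_t i;

    if (unlikely(!a))
        return -1;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Writing `unlikely(!a)` rather than `!likely(a)` follows the style adopted by the two CLEANUP commits listed just above: the negation goes inside the hint.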
Willy Tarreau	0e2686762f	MINOR: compiler: move CPU capabilities definition from config.h and complete them These ones are irrelevant to the config but rather to the platform, and as such are better placed in compiler.h. Here we take the opportunity for declaring a few extra capabilities: - HA_UNALIGNED : CPU supports unaligned accesses - HA_UNALIGNED_LE : CPU supports unaligned accesses in little endian - HA_UNALIGNED_FAST : CPU supports fast unaligned accesses - HA_UNALIGNED_ATOMIC : CPU supports unaligned accesses in atomics This will help remove a number of #ifdefs with arch-specific statements.	2020-02-21 16:32:57 +01:00
Jerome Magnin	9dde0b2d31	MINOR: ist: add an iststop() function Add a function that finds a character in an ist and returns an updated ist with the length of the portion of the original string that doesn't contain the char. Might be backported to 2.1	2020-02-21 11:47:25 +01:00
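The behaviour described for iststop() can be approximated as follows. The function is deliberately named `iststop_sketch` because its signature and body are assumptions derived from the one-line description above. The `struct ist` layout is the pointer-plus-length pair used elsewhere in this log.

```c
#include <stddef.h>

struct ist {
    char *ptr;
    size_t len;
};

/* Returns <s> shortened so that it ends just before the first occurrence of
 * <chr>. If <chr> is not found, <s> is returned unchanged. The data is not
 * copied or modified, only the length changes.
 */
struct ist iststop_sketch(struct ist s, char chr)
{
    size_t len = 0;

    while (len < s.len && s.ptr[len] != chr)
        len++;

    s.len = len;
    return s;
}
```

A typical use would be to cut a URI at its `?` to keep only the path part, without touching the underlying buffer.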
Willy Tarreau	716bec2dc6	MINOR: connection: introduce a new receive flag: CO_RFL_READ_ONCE This flag is currently supported by raw_sock to perform a single recv() attempt and avoid subscribing. Typically on the request and response paths with keep-alive, with short messages we know that it's very likely that the first message is enough.	2020-02-21 11:22:45 +01:00
Willy Tarreau	5d4d1806db	CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv} This marks the end of the transition from the connection polling states introduced in 1.5-dev12 and the subscriptions that arrived in 1.9. The socket layer can now safely use its FD while all upper layers rely exclusively on subscriptions. These old functions were removed. Some may deserve some renaming to improve clarity though. The single call to conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling() which already does the same.	2020-02-21 11:21:12 +01:00
Willy Tarreau	d1d14c3157	MINOR: connection: remove the last calls to conn_xprt_{want,stop}_* The last few calls to conn_xprt_{want,stop}_{recv,send} in the central connection code were replaced with their strictly exact equivalent fd_*, adding the call to conn_ctrl_ready() when it was missing.	2020-02-21 11:21:12 +01:00
Willy Tarreau	19bc201c9f	MEDIUM: connection: remove the intermediary polling state from the connection Historically we used to require that the connections held the desired polling states for the data layer and the socket layer. Then with muxes these were more or less merged into the transport layer, and now it happens that with all transport layers having their own state, the "transport layer state" as we have it in the connection (XPRT_RD_ENA, XPRT_WR_ENA) is only an exact copy of the underlying file descriptor state, but with a delay. All of this is causing some difficulties at many places in the code because there are still some locations which use the conn_want_* API to remain clean and only rely on connection, and count on a later collection call to conn_cond_update_polling(), while others need an immediate action and directly use the FD updates. Since our updates are now much cheaper, most of them being only an atomic test-and-set operation, and since our I/O callbacks are deferred, there's no benefit anymore in trying to "cache" the transient state change in the connection flags hoping to cancel them before they become an FD event. Better make such calls transparent indirections to the FD layer instead and get rid of the deferred operations which needlessly complicate the logic inside. This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE. A number of functions related to polling updates were either greatly simplified or removed. Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data were expected to be sent after a PROXY protocol or SOCKSv4 header. These ones were simply replaced with a check on the subscription which is where we ought to get the authoritative information from. Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts are the same. conn_stop_polling() and conn_xprt_stop_both() are the same as well. conn_cond_update_polling() only causes errors to stop polling. 
It also becomes way more obvious that muxes should not at all employ conn_xprt_{want|stop}_{recv,send}(), and that the call to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer is inappropriate, it ought to unsubscribe from reads instead. All of this definitely requires a serious cleanup.	2020-02-21 11:21:12 +01:00
Christopher Faulet	727a3f1ca3	MINOR: http-htx: Add a function to retrieve the headers size of an HTX message http_get_hdrs_size() function may now be used to get the bytes held by headers in an HTX message. It only works if the headers were not already forwarded. Metadata are not counted here.	2020-02-18 11:19:57 +01:00
Willy Tarreau	a71667c07d	BUG/MINOR: tools: also accept '+' as a valid character in an identifier The function is_idchar() was added by commit `36f586b` ("MINOR: tools: add is_idchar() to tell if a char may belong to an identifier") to ease matching of sample fetch/converter names. But it lacked support for the '+' character used in "base32+src" and "url32+src". A quick way to figure the list of supported sample fetch+converter names is to issue the following command: git grep '"[^"]",.SMP_T_.*SMP_USE_'|cut -f2 -d'"'|sort -u No more entry is reported once searching for characters not covered by is_idchar(). No backport is needed.	2020-02-17 06:37:40 +01:00
Willy Tarreau	e3b57bf92f	MINOR: sample: make sample_parse_expr() able to return an end pointer When an end pointer is passed, instead of complaining that a comma is missing after a keyword, sample_parse_expr() will silently return the pointer to the current location into this return pointer so that the caller can continue its parsing. This will be used by more complex expressions which embed sample expressions, and may even permit to embed sample expressions into arguments of other expressions.	2020-02-14 19:02:06 +01:00
Willy Tarreau	80b53ffb1c	MEDIUM: arg: make make_arg_list() stop after its own arguments The main problem we're having with argument parsing is that at the moment the caller looks for the first character looking like an end of arguments (')') and calls make_arg_list() on the sub-string inside the parenthesis. Let's first change the way it works so that make_arg_list() also consumes the parenthesis and returns the pointer to the first char not consumed. This will later permit to refine each argument parsing. For now there is no functional change.	2020-02-14 19:02:06 +01:00
Willy Tarreau	d4ad669051	MINOR: chunk: implement chunk_strncpy() to copy partial strings This does like chunk_strcpy() except that the maximum string length may be limited by the caller. A trailing zero is always appended. This is particularly handy to extract portions of strings to put into the trash for use with libc functions requiring a nul-terminated string.	2020-02-14 19:02:06 +01:00
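The semantics described for chunk_strncpy() (copy at most a given number of bytes, always append a trailing zero) can be sketched on a simplified buffer type. Everything here is hypothetical: `struct mini_chunk` and `mini_chunk_strncpy()` are stand-ins invented for the example, and the return convention (1 on success, 0 when the data does not fit) is an assumption rather than HAProxy's documented one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a chunk/buffer. */
struct mini_chunk {
    char *area;   /* storage */
    size_t size;  /* total bytes available in <area> */
    size_t data;  /* bytes currently used, excluding the trailing zero */
};

/* Copies at most <max> bytes of <str> into <chk> and always appends a
 * trailing zero. Returns 1 on success, or 0 without touching <chk> when
 * the bytes plus the trailing zero do not fit.
 */
int mini_chunk_strncpy(struct mini_chunk *chk, const char *str, size_t max)
{
    const char *end = memchr(str, '\0', max);
    size_t len = end ? (size_t)(end - str) : max;

    if (len >= chk->size)
        return 0;

    memcpy(chk->area, str, len);
    chk->area[len] = '\0';
    chk->data = len;
    return 1;
}
```

This is the pattern the commit mentions: extract a portion of a longer string into a scratch area so that it can be handed to libc functions that require NUL-terminated input.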
Willy Tarreau	36f586b694	MINOR: tools: add is_idchar() to tell if a char may belong to an identifier This function will simply be used to find the end of config identifiers (proxies, servers, ACLs, sample fetches, converters, etc).	2020-02-14 19:02:06 +01:00
Ilya Shipitsin	88a2f0304c	CLEANUP: ssl: remove unused functions in openssl-compat.h functions SSL_SESSION_get0_id_context, SSL_CTX_get_default_passwd_cb, SSL_CTX_get_default_passwd_cb_userdata are not used anymore	2020-02-14 16:15:00 +01:00
Willy Tarreau	160ad9e38a	CLEANUP: mini-clist: simplify nested do { while(1) {} } while (0) While looking for other occurrences of do { continue; } while (0) I found these few leftovers in mini-clist where an outer loop was made around "do { } while (0)" then another loop was placed inside just to handle the continue. Let's clean this up by just removing the outer one. Most of the patch is only the inner part of the loop that is reindented. It was verified that the resulting code is the same.	2020-02-11 10:27:04 +01:00
Christopher Faulet	7716cdf450	MINOR: lua: Get the action return code on the stack when an action finishes When an action successfully finishes, the action return code (ACT_RET_*) is now retrieved from the stack, if the first element is an integer. In addition, in hlua_txn_done(), the value ACT_RET_DONE is pushed on the stack before exiting. Thus, when a script uses this function, the corresponding action still finishes with the good code. Thanks to this change, the flag HLUA_STOP is now useless. So it has been removed. It is a mandatory step to allow a lua action to return any action return code.	2020-02-06 15:13:03 +01:00
Christopher Faulet	07a718e712	CLEANUP: lua: Remove consistency check for sample fetches and actions It is not possible anymore to alter the HTTP parser state from lua sample fetches or lua actions. So there is no reason to still check for the parser state consistency.	2020-02-06 15:13:03 +01:00
Christopher Faulet	4a2c142779	MEDIUM: http-rules: Support extra headers for HTTP return actions It is now possible to append extra headers to the responses generated by HTTP return actions, as long as they are not based on an errorfile. For return actions based on errorfiles, these extra headers are ignored. To define an extra header, a "hdr" argument must be used with a name and a value. The value is a log-format string. For instance: http-request return status 200 hdr "x-src" "%[src]" hdr "x-dst" "%[dst]"	2020-02-06 15:13:03 +01:00
Christopher Faulet	24231ab61f	MEDIUM: http-rules: Add the return action to HTTP rules Thanks to this new action, it is now possible to return any response from HAProxy, with any status code, based on an errorfile, a file or a string. Unlike the other internal messages generated by HAProxy, these ones are not interpreted as errors. And it is not necessary to use a file containing a full HTTP response, although it is still possible. In addition, using a log-format string or a log-format file, it is possible to have responses with dynamic content. This action can be used on the request path or the response path. The only constraint is to have a response smaller than a buffer. And to avoid any warning the buffer space reserved for headers rewriting should also be free. When a response is returned with a file or a string as payload, it only contains the content-length header and the content-type header, if applicable. Here are examples: http-request return content-type image/x-icon file /var/www/favicon.ico \ if { path /favicon.ico } http-request return status 403 content-type text/plain \ lf-string "Access denied. IP %[src] is blacklisted." \ if { src -f /etc/haproxy/blacklist.lst }	2020-02-06 15:12:54 +01:00
Christopher Faulet	6d0c3dfac6	MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding This patch introduces the 'http-after-response' rules. These rules are evaluated at the end of the response analysis, just before the data forwarding, on ALL HTTP responses, the server ones but also all responses generated by HAProxy. Thanks to this ruleset, it is now possible for instance to add some headers to the responses generated by the stats applet. Following actions are supported : * allow * add-header * del-header * replace-header * replace-value * set-header * set-status * set-var * strict-mode * unset-var	2020-02-06 14:55:34 +01:00
Christopher Faulet	ef70e25035	MINOR: http-ana: Add a function for forward internal responses Operations performed when internal responses (redirect/deny/auth/errors) are returned are always the same. The http_forward_proxy_resp() function is added to group all of them under a unique function.	2020-02-06 14:55:34 +01:00
Christopher Faulet	72c7d8d040	MINOR: http-ana: Rely on http_reply_and_close() to handle server error The http_server_error() function now relies on http_reply_and_close(). Both do almost the same actions. In addition, http_server_error() sets the error flag and the final state flag on the stream.	2020-02-06 14:55:34 +01:00
Christopher Faulet	c87e468816	MINOR: http-rules: Add a flag on redirect rules to know the rule direction HTTP redirect rules can be evaluated on the request or the response path. So when a redirect rule is evaluated, it is important to have this information because some specific processing may be performed depending on the direction. So the REDIRECT_FLAG_FROM_REQ flag has been added. It is set when applicable on the redirect rule during the parsing. This patch is mandatory to fix a bug on redirect rule. It must be backported to all stable versions.	2020-02-06 14:55:34 +01:00
Christopher Faulet	a4168434a7	MINOR: dns: Dynamically allocate dns options to reduce the act_rule size <.arg.dns.dns_opts> field in the act_rule structure is now dynamically allocated when a do-resolve rule is parsed. This drastically reduces the structure size.	2020-02-06 14:55:34 +01:00
Christopher Faulet	7651362e52	MINOR: htx/channel: Add a function to copy an HTX message in a channel's buffer The channel_htx_copy_msg() function can now be used to copy an HTX message in a channel's buffer. This function takes care to not overwrite existing data. This patch depends on the commit "MINOR: htx: Add a function to append an HTX message to another one". Both are mandatory to fix a bug in http_reply_and_close() function. Be careful to backport both first.	2020-02-06 14:55:16 +01:00
Christopher Faulet	0ea0c86753	MINOR: htx: Add a function to append an HTX message to another one The htx_append_msg() function can now be used to append an HTX message to another one. Either the whole message is copied or nothing is. If an error occurs during the copy, all changes are rolled back. This patch is mandatory to fix a bug in http_reply_and_close() function. Be careful to backport it first.	2020-02-06 14:54:47 +01:00
Olivier Houchard	1c7c0d6b97	BUG/MAJOR: memory: Don't forget to unlock the rwlock if the pool is empty. In __pool_get_first(), don't forget to unlock the pool lock if the pool is empty, otherwise no writer will be able to take the lock, and as it is done when reloading, it leads to an infinite loop on reload. This should be backported with commit `04f5fe87d3`	2020-02-03 13:05:31 +01:00
Olivier Houchard	04f5fe87d3	BUG/MEDIUM: memory: Add a rwlock before freeing memory. When using lockless pools, add a new rwlock, flush_pool. Read-lock it when getting memory from the pool, so that concurrent accesses are still authorized, but write-lock it when we're about to free memory, in pool_flush() and pool_gc(). The problem is, when removing an item from the pool, we dereference it to get the next one, however, that pointer may have been free'd in the meanwhile, and that could provoke a crash if the pointer has been unmapped. It should be OK to use a rwlock, as normal operations will still be able to access the pool concurrently, and calls to pool_flush() and pool_gc() should be pretty rare. This should be backported to 2.1, 2.0 and 1.9.	2020-02-01 18:08:34 +01:00
Willy Tarreau	b30a153cd1	MINOR: task: detect self-wakeups on tl==sched->current instead of TASK_RUNNING This is exactly what we want to detect (a task/tasklet waking itself), so let's use the proper condition for this.	2020-01-31 17:45:10 +01:00
Willy Tarreau	bb238834da	MINOR: task: permanently flag tasklets waking themselves up Commit `a17664d829` ("MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet") tried to inflict a penalty to self-requeuing tasks/tasklets which correspond to those involved in large, high-latency data transfers, for the benefit of all other processing which requires a low latency. However, it turns out that while it ought to do this on a case-by-case basis, basing itself on the RUNNING flag isn't accurate because this flag is never cleared for tasklets, so we'd rather need a distinct flag to tag such tasklets. This commit introduces TASK_SELF_WAKING to mark tasklets acting like this. For now it's still set when TASK_RUNNING is present but this will have to change. The flag is kept across wakeups.	2020-01-31 17:45:10 +01:00
Willy Tarreau	a17664d829	MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet When a tasklet re-runs itself such as in this chain: si_cs_io_cb -> si_cs_process -> si_notify -> si_chk_rcv then we know it can easily clobber the run queue and harm latency. Now what the scheduler does when it detects this is that such a tasklet is automatically placed into the bulk list so that it's processed with the remaining CPU bandwidth only. Thanks to this the CLI becomes instantly responsive again even under heavy stress at 50 Gbps over 40kcon and 100% CPU on 16 threads.	2020-01-30 19:03:31 +01:00
Willy Tarreau	a62917b890	MEDIUM: tasks: implement 3 different tasklet classes with their own queues We used to mix high latency tasks and low latency tasklets in the same list, and to even refill bulk tasklets there, causing some unfairness in certain situations (e.g. poll-less transfers between many connections saturating the machine with similarly-sized in and out network interfaces). This patch changes the mechanism to split the load into 3 lists depending on the task/tasklet's desired classes : - URGENT: this is mainly for tasklets used as deferred callbacks - NORMAL: this is for regular tasks - BULK: this is for bulk tasks/tasklets Arbitrary ratios of max_processed are picked from each of these lists in turn, with the ability to complete in one list from what was not picked in the previous one. After some quick tests, the following setup gave apparently good results both for raw TCP with splicing and for H2-to-H1 request rate: - 0 to 75% for urgent - 12 to 50% for normal - 12 to what remains for bulk Bulk is not used yet.	2020-01-30 18:59:33 +01:00
Willy Tarreau	911db9bd29	MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE As mentioned in commit `c192b0ab95` ("MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of consistency on which flags are checked among L4/L6/HANDSHAKE depending on the code areas. A number of sample fetch functions only check for L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and many check both L4L6 and HANDSHAKE. This patch starts to make all of this more consistent by introducing a new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and reports whether the transport layer is ready or not. All inconsistent call places were updated to rely on this one each time the goal was to check for the readiness of the transport layer.	2020-01-23 16:34:26 +01:00
Willy Tarreau	4450b587dd	MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE Most places continue to check CO_FL_HANDSHAKE while in fact they should check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one dedicated to SSL renegotiation. In fact the SSL layer should be the only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data when a renegotiation is in progress, but other ones randomly include it without knowing. And ideally it should even be an internal flag that's not exposed in the connection. This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL. In order to limit the confusion that has accumulated over time, the CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake, possibly used by a renegotiation was moved after the other ones.	2020-01-23 16:34:26 +01:00
Willy Tarreau	c192b0ab95	MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_* Commit `477902bd2e` ("MEDIUM: connections: Get ride of the xprt_done callback.") broke the master CLI for a very obscure reason. It happens that short requests immediately terminated by a shutdown are properly received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not yet set on the connection since we've not passed again through conn_fd_handler() and it was not done in conn_complete_session(). While commit `a8a415d31a` ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session()") fixed the issue, such accident may happen again as the root cause is deeper and actually comes down to the fact that CO_FL_CONNECTED is lazily set at various check points in the code but not every time we drop one wait bit. It is not the first time we face this situation. Originally this flag was used to detect the transition between WAIT_* and CONNECTED in order to call ->wake() from the FD handler. But since at least 1.8-dev1 with commit `7bf3fa3c23` ("BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is always synchronized against the two others before being checked. Moreover, with the I/Os moved to tasklets, the decision to call the ->wake() function is performed after the I/Os in si_cs_process() and equivalent, which don't care about this transition either. So in essence, checking for CO_FL_CONNECTED has become a lazy way to check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always relies on someone else having synchronized it. This patch addresses it once for all by killing this flag and only checking the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). 
This revealed a number of inconsistencies that were purposely not addressed here for the sake of bisectability: - while most places do check both L4+L6 and HANDSHAKE at the same time, some places like assign_server() or back_handle_st_con() and a few sample fetches looking for proxy protocol do check for L4+L6 but don't care about HANDSHAKE ; these ones will probably fail on TCP request session rules if the handshake is not complete. - some handshake handlers do validate that a connection is established at L4 but didn't clear CO_FL_WAIT_L4_CONN - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6 before declaring the mux ready while the snd_buf function also checks for the handshake's completion. Likely the former should validate the handshake as well and we should get rid of these extra tests in snd_buf. - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only later clear CO_FL_WAIT_L4_CONN. - xprt_handshake would set CO_FL_CONNECTED itself without actually clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if waiting for a pure Rx handshake. - most places in ssl_sock that were checking CO_FL_CONNECTED don't need to include the L4 check as an L6 check is enough to decide whether to wait for more info or not. It also becomes obvious when reading the test in si_cs_recv() that caused the failure mentioned above that once converted it doesn't make any sense anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete cannot happen since for CS_FL_EOS to be set, the other ones must have been validated. Some of these parts will still deserve further cleanup, and some of the observations above may induce some backports of potential bug fixes once totally analyzed in their context. The risk of breaking existing stuff is too high to blindly backport everything.	2020-01-23 14:41:37 +01:00
Olivier Houchard	477902bd2e	MEDIUM: connections: Get ride of the xprt_done callback. The xprt_done_cb callback was used to defer some connection initialization until we're connected and the handshakes are done. As it mostly consists of creating the mux, instead of using the callback, introduce a conn_create_mux() function, that will just call conn_complete_session() for frontend, and create the mux for backend. In h2_wake(), make sure we call the wake method of the stream_interface, as we no longer wakeup the stream task.	2020-01-22 18:56:05 +01:00
Olivier Houchard	8af03b396a	MEDIUM: streams: Always create a conn_stream in connect_server(). In connect_server(), when creating a new connection for which we don't yet know the mux (because it'll be decided by the ALPN), instead of associating the connection to the stream_interface, always create a conn_stream. This way, we have less special-casing needed. Store the conn_stream in conn->ctx, so that we can reach the upper layers if needed.	2020-01-22 18:55:59 +01:00
Emmanuel Hocdet	6b5b44e10f	BUG/MINOR: ssl: ssl_sock_load_pem_into_ckch is not consistent The "set ssl cert <filename> <payload>" CLI command should have the same result as reloading HAProxy with the updated PEM file (<filename>). This is not the case: DHparams/cert-chain is kept from the previous context if no DHparams/cert-chain is set in the context (<payload>). This patch should be backported to 2.1	2020-01-22 15:55:55 +01:00
Adis Nezirovic	1a693fc2fd	MEDIUM: cli: Allow multiple filter entries for "show table" For complex stick tables with many entries/columns, it can be beneficial to filter using multiple criteria. The maximum number of filter entries can be controlled by defining STKTABLE_FILTER_LEN during build time. This patch can be backported to older releases.	2020-01-22 14:33:17 +01:00
Ilya Shipitsin	056c629531	BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x While working on issue #429, I encountered build failures with various non-released openssl versions. Let us improve the ssl defines and switch to features, not versions, for EVP_CTRL_AEAD_SET_IVLEN and EVP_CTRL_AEAD_SET_TAG. No backport is needed as there is no valid reason to build a stable haproxy version against a development version of openssl.	2020-01-22 07:54:52 +01:00
Willy Tarreau	2086365f51	CLEANUP: pattern: remove the pat_time definition It was inherited from acl_time, introduced in 1.3.10 by commit `a84d374367` ("[MAJOR] new framework for generic ACL support") and was never ever used. Let's simply drop it now.	2020-01-22 07:44:36 +01:00
Tim Duesterhus	6a0dd73390	CLEANUP: Consistently `unsigned int` for bitfields Signed bitfields of size `1` hold the values `0` and `-1`, but are usually assigned `1`, possibly leading to subtle bugs when the value is explicitly compared against `1`.	2020-01-22 07:28:39 +01:00
Baptiste Assmann	13a9232ebc	MEDIUM: dns: use Additional records from SRV responses Most DNS servers provide A/AAAA records in the Additional section of a response, which correspond to the SRV records from the Answer section: ;; QUESTION SECTION: ;_http._tcp.be1.domain.tld. IN SRV ;; ANSWER SECTION: _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A1.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A8.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A5.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A6.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A4.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A3.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A2.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A7.domain.tld. ;; ADDITIONAL SECTION: A1.domain.tld. 3600 IN A 192.168.0.1 A8.domain.tld. 3600 IN A 192.168.0.8 A5.domain.tld. 3600 IN A 192.168.0.5 A6.domain.tld. 3600 IN A 192.168.0.6 A4.domain.tld. 3600 IN A 192.168.0.4 A3.domain.tld. 3600 IN A 192.168.0.3 A2.domain.tld. 3600 IN A 192.168.0.2 A7.domain.tld. 3600 IN A 192.168.0.7 SRV record support was introduced in HAProxy 1.8 and the first design did not take into account the records from the Additional section. Instead, a new resolution is associated to each server with its relevant FQDN. This behavior generates a lot of DNS requests (1 SRV + 1 per server associated). This patch aims at fixing this by: - when a DNS response is validated, we associate A/AAAA records to relevant SRV ones - set a flag on associated servers to prevent them from running a DNS resolution for said FQDN - update server IP address with information found in the Additional section If no relevant record can be found in the Additional section, then HAProxy will fall back to running a dedicated resolution for this server, as it used to do. This behavior is the one described in RFC 2782.	2020-01-22 07:19:54 +01:00
Christopher Faulet	2f5339079b	MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive It is now possible to insert any attribute when a cookie is inserted by HAProxy. Any value may be set, no check is performed except the syntax validity (CTRL chars and ';' are forbidden). For instance, it may be used to add the SameSite attribute: cookie SRV insert attr "SameSite=Strict" The attr option may be repeated to add several attributes. This patch should fix the issue #361.	2020-01-22 07:18:31 +01:00
Christopher Faulet	554c0ebffd	MEDIUM: http-rules: Support an optional error message in http deny rules It is now possible to set the error message to use when a deny rule is executed. It may be a specific error file, adding "errorfile <file>" : http-request deny deny_status 400 errorfile /etc/haproxy/errorfiles/400badreq.http It may also be an error file from an http-errors section, adding "errorfiles <name>" : http-request deny errorfiles my-errors # use 403 error from "my-errors" section When defined, this error message is set in the HTTP transaction. The tarpit rule is also concerned by this change.	2020-01-20 15:18:46 +01:00
Christopher Faulet	473e880a25	MINOR: http-ana: Add an error message in the txn and send it when defined It is now possible to set the error message to return to client in the HTTP transaction. If it is defined, this error message is used instead of proxy's errors or default errors.	2020-01-20 15:18:46 +01:00
Christopher Faulet	76edc0f29c	MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy It is now possible to import in a proxy, fully or partially, error files declared in an http-errors section. It may be done using the "errorfiles" directive, followed by a name and optionally a list of status code. If there is no status code specified, all error files of the http-errors section are imported. Otherwise, only error files associated to the listed status code are imported. For instance : http-errors my-errors errorfile 400 ... errorfile 403 ... errorfile 404 ... frontend frt errorfiles my-errors 403 404 # ==> error 400 not imported	2020-01-20 15:18:46 +01:00
Christopher Faulet	35cd81d363	MINOR: http-htx: Add a new section to create groups of custom HTTP errors A new section may now be declared in the configuration to create global groups of HTTP errors. These groups are not linked to a proxy and are referenced by name. The section must be declared using the keyword "http-errors" followed by the group name. This name must be unique. A list of "errorfile" directives may be declared in such section. For instance: http-errors website-1 errorfile 400 /path/to/site1/400.http errorfile 404 /path/to/site1/404.http http-errors website-2 errorfile 400 /path/to/site2/400.http errorfile 404 /path/to/site2/404.http For now, it is just possible to create "http-errors" sections. There is no documentation because these groups are not used yet.	2020-01-20 15:18:46 +01:00
Christopher Faulet	5885775de1	MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages All custom HTTP errors are now stored in a global tree. Proxies use references to these messages. The key used for errorfile directives is the file name as specified in the configuration. For errorloc directives, a key is created using the redirect code and the url. This means that the same custom error message is now stored only once. It may be used in several proxies or for several status codes, it is only parsed and stored once.	2020-01-20 15:18:46 +01:00
Christopher Faulet	bdf6526e94	MINOR: http-htx: Add functions to create HTX redirect message http_parse_errorloc() may now be used to create an HTTP 302 or 303 redirect message with a specific url passed as parameter. A parameter is used to know whether it is a 302 or a 303 redirect. A status code is passed as parameter. It must be one of the supported HTTP error codes to be valid. Otherwise an error is returned. It aims to be used to parse "errorloc" directives. It relies on http_load_errormsg() to do most of the job, ie converting it in HTX.	2020-01-20 15:18:45 +01:00
Christopher Faulet	5031ef58ca	MINOR: http-htx: Add functions to read a raw error file and convert it in HTX http_parse_errorfile() may now be used to parse a raw HTTP message from a file. A status code is passed as parameter. It must be one of the supported HTTP error codes to be valid. Otherwise an error is returned. It aims to be used to parse "errorfile" directives. It relies on http_load_errorfile() to do most of the job, ie reading the file content and converting it in HTX.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d73b96d48c	MINOR: tcp-rules: Make tcp-request capture a custom action Now, this action uses its own dedicated function and is no longer handled "in place" during the TCP rules evaluation. Thus the action name ACT_TCP_CAPTURE is removed. The action type is set to ACT_CUSTOM and a check function is used to know if the rule depends on request contents while there is no inspect-delay.	2020-01-20 15:18:45 +01:00
Christopher Faulet	ac98d81f46	MINOR: http-rule/tcp-rules: Make track-sc* custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the TCP/HTTP rules evaluation. Thus the action names ACT_ACTION_TRK_SC0 and ACT_ACTION_TRK_SCMAX are removed. The action type is now the tracking index. Thus the function trk_idx() is no longer needed.	2020-01-20 15:18:45 +01:00
Christopher Faulet	91b3ec13c6	MEDIUM: http-rules: Make early-hint custom actions Now, the early-hint action uses its own dedicated action and is no longer handled "in place" during the HTTP rules evaluation. Thus the action name ACT_HTTP_EARLY_HINT is removed. In addition, http_add_early_hint_header() and http_reply_103_early_hints() are also removed. This part is now handled in the new action_ptr callback function.	2020-01-20 15:18:45 +01:00
Christopher Faulet	046cf44f6c	MINOR: http-rules: Make set/del-map and add/del-acl custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_*_ACL and ACT_HTTP_*_MAP are removed. The action type is now mapped as follows: 0 = add-acl, 1 = set-map, 2 = del-acl and 3 = del-map.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d1f27e3394	MINOR: http-rules: Make set-header and add-header custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_SET_HDR and ACT_HTTP_ADD_VAL are removed. The action type is now set to 0 to set a header (so remove existing ones if any and add a new one) or to 1 to add a header (add without remove).	2020-01-20 15:18:45 +01:00
Christopher Faulet	92d34fe38d	MINOR: http-rules: Make replace-header and replace-value custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_REPLACE_HDR and ACT_HTTP_REPLACE_VAL are removed. The action type is now set to 0 to evaluate the whole header or to 1 to evaluate every comma-delimited values. The function http_transform_header_str() is renamed to http_replace_hdrs() to be more explicit and the function http_transform_header() is removed. In fact, this last one is now more or less the new action function. The lua code has been updated accordingly to use http_replace_hdrs().	2020-01-20 15:18:45 +01:00
Christopher Faulet	006f6507d7	MINOR: actions: Use an integer to set the action type <action> field in the act_rule structure is now an integer. The act_name values are used for all actions without action function (but it is not a prerequisite though) or the action will have no effect. But for all other actions, any integer value may be used, only the action function will take care of it. The default for such actions is ACT_CUSTOM.	2020-01-20 15:18:45 +01:00
Christopher Faulet	245cf795c1	MINOR: actions: Add flags to configure the action behaviour Some flags can now be set on an action when it is registered. The flags are defined in the act_flag enum. For now, only ACT_FLAG_FINAL may be set on an action to specify if it stops the rules evaluation. It is set on ACT_ACTION_ALLOW, ACT_ACTION_DENY, ACT_HTTP_REQ_TARPIT, ACT_HTTP_REQ_AUTH, ACT_HTTP_REDIR and ACT_TCP_CLOSE actions. But, when required, it may also be set on custom actions. Consequently, this flag is checked instead of the action type during the configuration parsing to trigger a warning when a rule inhibits all the following ones.	2020-01-20 15:18:45 +01:00
Christopher Faulet	105ba6cc54	MINOR: actions: Rename the act_flag enum into act_opt The flags in the act_flag enum have been renamed act_opt. It means ACT_OPT prefix is used instead of ACT_FLAG. The purpose of this patch is to reserve the action flags for the actions configuration.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cd26e8a2ec	MINOR: http-rules/tcp-rules: Call the defined action function first if defined When TCP and HTTP rules are evaluated, if an action function (action_ptr field in the act_rule structure) is defined for a given action, it is now always called in priority over the test on the action type. Concretely, for now, only custom actions define it. Thus there is no change. It just leaves us the choice to extend the action type beyond the existing ones in the enum.	2020-01-20 15:18:45 +01:00
Christopher Faulet	96bff76087	MINOR: actions: Regroup some info about HTTP rules in the same struct Info used by HTTP rules manipulating the message itself are split in several structures in the arg union. But it is possible to group all of them in a unique struct. Now, <arg.http> is used by most of these rules, which contains: * <arg.http.i> : an integer used as status code, nice/tos/mark/loglevel or action id. * <arg.http.str> : an IST used as header name, reason string or auth realm. * <arg.http.fmt> : a log-format compatible expression * <arg.http.re> : a regular expression used by replace rules	2020-01-20 15:18:45 +01:00
Christopher Faulet	58b3564fde	MINOR: actions: Add a function pointer to release args used by actions Arguments used by actions are never released during HAProxy deinit. Now, it is possible to specify a function to do so. ".release_ptr" field in the act_rule structure may be set during the configuration parsing to a specific deinit function depending on the action type.	2020-01-20 15:18:45 +01:00
Christopher Faulet	e00d06c99f	MINOR: http-rules: Handle all message rewrites the same way In HTTP rules, error handling during a rewrite is now handled the same way for all rules. First, allocation errors are reported as internal errors. Then, if soft rewrites are allowed, rewrite errors are ignored and only the failed_rewrites counter is incremented. Otherwise, when strict rewrites are mandatory, internal errors are returned. For now, only soft rewrites are supported. Note also that the warning sent to notify a rewrite failure was removed. It will be useless once strict rewrites are possible.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a00071e2e5	MINOR: http-ana: Add a txn flag to support soft/strict message rewrites The HTTP_MSGF_SOFT_RW flag must now be set on the HTTP transaction to ignore rewrite errors on a message, from HTTP rules. This mode is called soft rewrites. If this flag is not set, strict rewrites are performed. In this mode, if a rewrite error occurs, an internal error is reported. For now, HTTP_MSGF_SOFT_RW is always set and there is no way to switch a transaction in strict mode.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a08546bb5a	MINOR: counters: Remove failed_secu counter and use denied_resp instead The failed_secu counter is only used for the servers stats. It is used to report the number of denied responses. On proxies, the same info is stored in the denied_resp counter. So, it is more consistent to use the same field for servers.	2020-01-20 15:18:45 +01:00
Christopher Faulet	0159ee4032	MINOR: stats: Report internal errors in the proxies/listeners/servers stats The stats field ST_F_EINT has been added to report internal errors encountered per proxy, per listener and per server. It appears in the CLI export and on the HTML stats page.	2020-01-20 15:18:45 +01:00
Christopher Faulet	30a2a3724b	MINOR: http-rules: Add more return codes to let custom actions act as normal ones When HTTP/TCP rules are evaluated, especially HTTP ones, some results are possible for normal actions and not for custom ones. So missing return codes (ACT_RET_*) have been added to let custom actions act as normal ones. Concretely following codes have been added: * ACT_RET_DENY : deny the request/response. It must be handled by the caller * ACT_RET_ABRT : abort the request/response, handled by the action itself. * ACT_RET_INV : invalid request/response	2020-01-20 15:18:45 +01:00
Christopher Faulet	4d90db5f4c	MINOR: http-rules: Add a rule result to report internal error Now, when HTTP rules are evaluated, HTTP_RULE_RES_ERROR must be returned when an internal error is caught. It is a way to make the difference between a bad request or a bad response and an error during its processing.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d4ce6c2957	MINOR: counters: Add a counter to report internal processing errors This counter, named 'internal_errors', has been added in frontend and backend counters. It should be used when an internal error is encountered, instead of failed_req or failed_resp.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cb5501327c	BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules Functions to deinitialize the HTTP rules are buggy. These functions do not check the action name to release the right part in the arg union. Only few info are released. For auth rules, the realm is released and there is no problem here. But the regex <arg.hdr_add.re> is always unconditionally released. So it is easy to make these functions crash. For instance, with the following rule HAProxy crashes during the deinit : http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)] For now, these functions are simply removed and we rely on the deinit function used for TCP rules (renamed as deinit_act_rules()). This patch fixes the bug. But arguments used by actions are not released at all, this part will be addressed later. This patch must be backported to all stable versions.	2020-01-20 15:18:45 +01:00
Willy Tarreau	ee1a6fc943	MINOR: connection: make the last arg of subscribe() a struct wait_event* The subscriber used to be passed as a "void param" that was systematically cast to a struct wait_event. By now it appears clear that the subscribe() call at every layer is well defined and always takes a pointer to an event subscriber of type wait_event, so let's enforce this in the functions' prototypes, remove the intermediary variables used to cast it and clean up the comments to clarify what all these functions do in their context.	2020-01-17 18:30:37 +01:00
Willy Tarreau	7872d1fc15	MEDIUM: connection: merge the send_wait and recv_wait entries In practice all callers use the same wait_event notification for any I/O so instead of keeping specific code to handle them separately, let's merge them and it will allow us to create new events later.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3a9312af8f	REORG: stream/backend: move backend-specific stuff to backend.c For more than a decade we've kept all the sess_update_st_*() functions in stream.c while they're only there to work in relation with what is currently being done in backend.c (srv_redispatch_connect, connect_server, etc). Let's move all this pollution over there and take this opportunity to try to find slightly less confusing names for these old functions whose role is only to handle transitions from one specific stream-int state: sess_update_st_rdy_tcp() -> back_handle_st_rdy() sess_update_st_con_tcp() -> back_handle_st_con() sess_update_st_cer() -> back_handle_st_cer() sess_update_stream_int() -> back_try_conn_req() sess_prepare_conn_req() -> back_handle_st_req() sess_establish() -> back_establish() The last one remained in stream.c because it's more or less a completion function which does all the initialization expected on a connection success or failure, can set analysers and emit logs. The other ones could possibly slightly benefit from being modified to take a stream-int instead since it's really what they're working with, but it's unimportant here.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3381bf89e3	MEDIUM: connection: get rid of CO_FL_CURR_* flags These ones used to serve as a set of switches between CO_FL_SOCK_* and CO_FL_XPRT_, and now that the SOCK layer is gone, they're always a copy of the last know CO_FL_XPRT_ ones that is resynchronized before I/O events by calling conn_refresh_polling_flags(), and that are pushed back to FDs when detecting changes with conn_xprt_polling_changes(). While these functions are not particularly heavy, what they do is totally redundant by now because the fd_want_/fd_stop_() actions already perform test-and-set operations to decide to create an entry or not, so they do the exact same thing that is done by conn_xprt_polling_changes(). As such it is pointless to call that one, and given that the only reason to keep CO_FL_CURR_* is to detect changes there, we can now remove them. Even if this does only save very few cycles, this removes a significant complexity that has been responsible for many bugs in the past, including the last one affecting FreeBSD. All tests look good, and no performance regressions were observed.	2020-01-17 17:45:12 +01:00
Willy Tarreau	e2a0eeca77	MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only CO_FL_WAIT_ROOM is set by the splicing function in raw_sock, and cleared by the stream-int when splicing is disabled, as well as in conn_refresh_polling_flags() so that a new call to ->rcv_pipe() could be attempted by the I/O callbacks called from conn_fd_handler(). This clearing in conn_refresh_polling_flags() makes no sense anymore and is in no way related to the polling at all. Since we don't call them from there anymore it's better to clear it before attempting to receive, and to set it again later. So let's move this operation where it should be, in raw_sock_to_pipe() so that it's now symmetric. It was also placed in raw_sock_to_buf() so that we're certain that it gets cleared if an attempt to splice is replaced with a subsequent attempt to recv(). And these were currently already achieved by the call to conn_refresh_polling_flags(). Now it could theoretically be removed from the stream-int.	2020-01-17 17:19:27 +01:00
Willy Tarreau	17ccd1a356	BUG/MEDIUM: connection: add a mux flag to indicate splice usability Commit `c640ef1a7d` ("BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible") fixed splicing in TCP and legacy mode but broke it badly in HTX mode. What happens in HTX mode is that the channel's to_forward value remains set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is not a reliable signal anymore to indicate whether more data are expected or not. Thus, when data are spliced out of the mux using rcv_pipe(), even when the end is reached (that only the mux knows about), the call to rcv_buf() to get the final HTX blocks completing the message was skipped and there was often no new event to wake this up, resulting in transfer timeouts at the end of large objects. All this goes down to the fact that the channel has no more information about whether it can splice or not despite being the one having to take the decision to call rcv_pipe() or not. And we cannot afford to call rcv_buf() unconditionally because, as the commit above showed, this reduces the forwarding performance by 2 to 3 in TCP and legacy modes due to data lying in the buffer preventing splicing from being used later. The approach taken by this patch consists in offering the muxes the ability to report a bit more information to the upper layers via the conn_stream. This information could simply be to indicate that more data are awaited but the real need being to distinguish splicing and receiving, here instead we clearly report the mux's willingness to be called for splicing or not. Hence the flag's name, CS_FL_MAY_SPLICE. The mux sets this flag when it knows that its buffer is empty and that data waiting past what is currently known may be spliced, and clears it when it knows there's no more data or that the caller must fall back to rcv_buf() instead.
The stream-int code now uses this to determine if splicing may be used or not instead of looking at the rcv_pipe() callbacks through the whole chain. And after the rcv_pipe() call, it checks the flag again to decide whether it may safely skip rcv_buf() or not. All this bitfield dance remains a bit complex and it starts to appear obvious that splicing vs reading should be a decision of the mux based on permission granted by the data layer. This would however increase the API's complexity but definitely needs to be thought about, and should even significantly simplify the data processing layer. The way it was integrated in mux-h1 will also result in no more calls to rcv_pipe() on chunked encoded data, since these ones are currently disabled at the mux level. However once the issue with chunks+splice is fixed, it will be important to explicitly check for curr_len|CHNK to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk. This fix must be backported to 2.1 and 2.0.	2020-01-17 17:00:12 +01:00
Willy Tarreau	340b07e868	BUG/MAJOR: hashes: fix the signedness of the hash inputs Wietse Venema reported in the thread below that we have a signedness issue with our hash implementations: due to the use of const char* for the input key that's often text, the crc32, sdbm, djb2, and wt6 algorithms return a platform-dependent value for binary input keys containing bytes with bit 7 set. This means that an ARM or PPC platform will hash binary inputs differently from an x86 typically. Worse, some algorithms are well defined in the industry (like CRC32) and do not provide the expected result on x86, possibly causing interoperability issues (e.g. a user-agent would fail to compare the CRC32 of a message body against the one computed by haproxy). Fortunately, and contrary to the first impression, the CRC32c variant used in the PROXY protocol processing is not affected. Thus the impact remains very limited (the vast majority of input keys are text-based, such as user-agent headers for example). This patch addresses the issue by fixing all hash functions' prototypes (even those not affected, for API consistency). A reg test will follow in another patch. The vast majority of users do not use these hashes. And among those using them, very few will pass them on binary inputs. However, for the rare ones doing it, this fix MAY have an impact during the upgrade. For example if the package is upgraded on one LB then on another one, and the CRC32 of a binary input is used as a stick table key (why?) then these CRCs will not match between both nodes. Similarly, if "hash-type ... crc32" is used, LB inconsistency may appear during the transition. For this reason it is preferable to apply the patch on all nodes using such hashes at the same time. Systems upgraded via their distros will likely observe the least impact since they're expected to be upgraded within a short time frame.
And it is important for distros NOT to skip this fix, in order to avoid distributing an incompatible implementation of a hash. This is the reason why this patch is tagged as MAJOR, even though it's extremely unlikely that anyone will ever notice a change at all. This patch must be backported to all supported branches since the hashes were introduced in 1.5-dev20 (commit `98634f0c`). Some parts may be dropped since implemented later. Link to Wietse's report: https://marc.info/?l=postfix-users&m=157879464518535&w=2	2020-01-16 08:23:42 +01:00
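The signedness problem described in the commit above can be reproduced with a simplified djb2 loop. This is only a sketch: `djb2_signed()` and `djb2_unsigned()` are hypothetical names, not haproxy's functions, and the signed variant forces `signed char` to show what a platform with signed plain `char` would compute.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not haproxy's implementation.
 * Reading the key through signed char sign-extends every byte >= 0x80,
 * which is what happens on platforms where plain char is signed. */
static uint32_t djb2_signed(const signed char *key, size_t len)
{
    uint32_t hash = 5381;

    for (size_t i = 0; i < len; i++)
        hash = hash * 33 + (uint32_t)(int)key[i];
    return hash;
}

/* Reading the key through unsigned char gives the same result everywhere. */
static uint32_t djb2_unsigned(const unsigned char *key, size_t len)
{
    uint32_t hash = 5381;

    for (size_t i = 0; i < len; i++)
        hash = hash * 33 + key[i];
    return hash;
}
```

Both variants agree on plain ASCII keys and only diverge once a byte with bit 7 set appears, which matches the commit's remark that text-based keys are not affected.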
Willy Tarreau	f31af9367e	MEDIUM: lua: don't call the GC as often when dealing with outgoing connections In order to properly close connections established from Lua in case a Lua context dies, the context currently automatically gets a flag HLUA_MUST_GC set whenever an outgoing connection is used. This causes the GC to be enforced on the context's death as well as on yield. First, it does not appear necessary to do it when yielding, since if the connections die they are already cleaned up. Second, the problem with the flag is that even if a connection gets properly closed, the flag is not removed and the GC continues to be called on the Lua context. The impact on performance looks quite significant, as noticed and diagnosed by Sadasiva Gujjarlapudi in the following thread: https://www.mail-archive.com/haproxy@formilux.org/msg35810.html This patch changes the flag for a counter so that each created connection increments it and each cleanly closed connection decrements it. That way we know we have to call the GC on the context's death only if the count is non-null. As reported in the thread above, the Lua performance gain is now over 20% by doing this. Thanks to Sada and Thierry for the design discussion and tests that led to this solution.	2020-01-14 10:12:31 +01:00
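The flag-to-counter change described in the commit above can be pictured as follows. This is a minimal sketch under assumptions: `struct lua_ctx` and the three helpers are hypothetical names chosen for illustration and do not come from the haproxy sources.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a Lua execution context. */
struct lua_ctx {
    unsigned int open_conns; /* outgoing connections not yet cleanly closed */
};

/* Each created outgoing connection increments the counter. */
static void ctx_conn_created(struct lua_ctx *ctx)
{
    ctx->open_conns++;
}

/* Each cleanly closed connection decrements it. */
static void ctx_conn_closed(struct lua_ctx *ctx)
{
    if (ctx->open_conns)
        ctx->open_conns--;
}

/* The GC is only needed on context death if connections are still open. */
static bool ctx_needs_gc_on_death(const struct lua_ctx *ctx)
{
    return ctx->open_conns != 0;
}
```

Unlike a sticky flag, the counter returns to zero once every connection has been closed, so a well-behaved script no longer pays for a forced GC.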
Olivier Houchard	3c4f40acbf	BUG/MEDIUM: tasks: Use the MT macros in tasklet_free(). In tasklet_free(), to attempt to remove ourself, use MT_LIST_DEL, we can't just use LIST_DEL(), as we theoretically could be in the shared tasklet list. This should be backported to 2.1.	2020-01-10 16:56:59 +01:00
Florian Tham	9205fea13a	MINOR: http: Add 404 to http-request deny This patch adds http status code 404 Not Found to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Florian Tham	272e29b5cc	MINOR: http: Add 410 to http-request deny This patch adds http status code 410 Gone to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Willy Tarreau	eaf05be0ee	OPTIM: polling: do not create update entries for FD removal In order to reduce the number of poller updates, we can benefit from the fact that modern pollers use sampling to report readiness and that under load they rarely report the same FD multiple times in a row. As such it's not always necessary to disable such FDs especially when we're almost certain they'll be re-enabled again and will require another set of syscalls. Now instead of creating an update for a (possibly temporary) removal, we only perform this removal if the FD is reported again as ready while inactive. In addition this is performed via another update so that alternating workloads like transfers have a chance to re-enable the FD without any syscall during the loop (typically after the data that filled a buffer have been sent). However we only do that for single- threaded FDs as the other ones require a more complex setup and are not on the critical path. This does cause a few spurious wakeups but almost totally eliminates the calls to epoll_ctl() on connections seeing intermittent traffic like HTTP/1 to a server or client.
A typical example with 100k requests for 4 kB objects over 200 connections shows that the number of epoll_ctl() calls doesn't depend on the number of requests anymore but most exclusively on the number of established connections: Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 57.09 0.499964 0 654361 321190 recvfrom 38.33 0.335741 0 369097 1 epoll_wait 4.56 0.039898 0 44643 epoll_ctl 0.02 0.000211 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.875814 1068301 321391 total After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 59.25 0.504676 0 657600 323630 recvfrom 40.68 0.346560 0 374289 1 epoll_wait 0.04 0.000370 0 620 epoll_ctl 0.03 0.000228 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.851834 1032709 323831 total As expected there is also a slight increase of epoll_wait() calls since delaying de-activation of events can occasionally cause one spurious wakeup.	2019-12-27 16:38:47 +01:00
Willy Tarreau	19689882e6	MINOR: poller: do not call the IO handler if the FD is not active For now this almost never happens but with subsequent patches it will become more important not to uselessly call the I/O handlers if the FD is not active.	2019-12-27 16:38:47 +01:00
Willy Tarreau	0fbc318e24	CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE Both flags became equal in commit `82967bf9` ("MINOR: connection: adjust CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the overlap between xprt_done_cb() and wake() after the removal of the DATA specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the "_DONE" version already covers everything and explains the intent well enough.	2019-12-27 16:38:47 +01:00
Willy Tarreau	4970e5adb7	REORG: connection: move tcp_connect_probe() to conn_fd_check() The function is not TCP-specific at all, it covers all FD-based sockets so let's move this where other similar functions are, in connection.c, and rename it conn_fd_check().	2019-12-27 16:38:43 +01:00
Willy Tarreau	11ef0837af	MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP In practice it's all pollers except select(). It turns out that we're keeping some legacy code only for select and enforcing it on all pollers, let's offer the pollers the ability to declare that they do not need that.	2019-12-27 14:04:33 +01:00
Lukas Tribus	a26d1e1324	BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled with the no-deprecated option. Remove existing, incomplete guards and add a compatibility macro in openssl-compat.h, just as OpenSSL does: `bf4006a6f9/include/openssl/ssl.h (L1486)` This should be backported as far as 2.0 and probably even 1.9.	2019-12-21 06:46:55 +01:00
Rosen Penev	b3814c2ca8	BUG/MINOR: ssl: openssl-compat: Fix getm_ defines LIBRESSL_VERSION_NUMBER evaluates to 0 under OpenSSL, making the condition always true. Check for the define before checking it. Signed-off-by: Rosen Penev <rosenp@gmail.com> [wt: to be backported as far as 1.9]	2019-12-20 16:01:31 +01:00
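The pitfall fixed by the commit above comes from the preprocessor rule that an undefined identifier evaluates to 0 inside `#if`. The sketch below uses a hypothetical macro `DEMO_LIB_VERSION` and an arbitrary threshold value, both chosen for illustration only; the macro is deliberately left undefined, the way `LIBRESSL_VERSION_NUMBER` is when building against OpenSSL.

```c
/* DEMO_LIB_VERSION is intentionally never defined in this file. */

/* Unguarded test: the undefined macro is replaced by 0, so the
 * comparison is true even though the library is not in use. */
#if DEMO_LIB_VERSION < 0x2070000fL
#define UNGUARDED_MATCH 1
#else
#define UNGUARDED_MATCH 0
#endif

/* Guarded test: check that the macro exists before comparing it. */
#if defined(DEMO_LIB_VERSION) && (DEMO_LIB_VERSION < 0x2070000fL)
#define GUARDED_MATCH 1
#else
#define GUARDED_MATCH 0
#endif

static int unguarded_match(void) { return UNGUARDED_MATCH; }
static int guarded_match(void)   { return GUARDED_MATCH; }
```

Building with a warning such as GCC's `-Wundef` is one way to have the compiler point out the unguarded form.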
Willy Tarreau	dd0e89a084	BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreign requeuing Since 1.9 with commit `b20aa9eef3` ("MAJOR: tasks: create per-thread wait queues") a task bound to a single thread will not use locks when being queued or dequeued because the wait queue is assumed to be the owner thread's. But there exists a rare situation where this is not true: the health check tasks may be running on one thread waiting for a response, and may in parallel be requeued by another thread calling health_adjust() after detecting a response error in traffic when "observe l7" is set, and "fastinter" is lower than "inter", requiring to shorten the running check's timeout. In this case, the task being requeued was present in another thread's wait queue, thus opening a race during task_unlink_wq(), and gets requeued into the calling thread's wait queue instead of the running one's, opening a second race here. This patch aims at protecting against the risk of calling task_unlink_wq() from one thread while the task is queued on another thread, hence unlocked, by introducing a new TASK_SHARED_WQ flag. This new flag indicates that a task's position in the wait queue may be adjusted by other threads than the one currently executing it. This means that such WQ manipulations must be performed under a lock. There are two types of such tasks: - the global ones, using the global wait queue (technically speaking, those whose thread_mask has at least 2 bits set). - some local ones, which for now will be placed into the global wait queue as well in order to benefit from its lock. The flag is automatically set on initialization if the task's thread mask indicates more than one thread. The caller must also set it if it intends to let other threads update the task's expiration delay (e.g. delegated I/Os), or if it intends to change the task's affinity over time as this could lead to the same situation.
Right now only the situation described above seems to be affected by this issue, and it is very difficult to trigger, and even then, will often have no visible effect beyond stopping the checks for example once the race is met. On my laptop it is feasible with the following config, chained to httpterm: global maxconn 400 # provoke FD errors, calling health_adjust() defaults mode http timeout client 10s timeout server 10s timeout connect 10s listen px bind :8001 option httpchk /?t=50 server sback 127.0.0.1:8000 backup server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7 This patch will automatically address the case for the checks because check tasks are created with multiple threads bound and will get the TASK_SHARED_WQ flag set. If in the future more tasks need to rely on this (multi-threaded muxes for example) and the use of the global wait queue becomes a bottleneck again, then it should not be too difficult to place locks on the local wait queues and queue the task on its bound thread. This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task". Many thanks to William Dauchy for providing detailed traces allowing to spot the problem.	2019-12-19 14:42:22 +01:00
Christopher Faulet	76014fd118	MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state During H1 parsing, the HTX EOM block is added before switching the message state to H1_MSG_DONE. It is an exception in the way to convert an H1 message to HTX. Except for this block, the message is first switched to the right state before starting to add the corresponding HTX blocks. For instance, the message is switched in H1_MSG_DATA state and then the HTX DATA blocks are added. With this patch, the message is switched to the H1_MSG_DONE state when all data blocks or trailers were processed. It is the caller's responsibility to call h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far easier to catch failures when the HTX buffer is full. The H1 and FCGI muxes have been updated accordingly. This patch may eventually be backported to 2.1 if it helps other backports.	2019-12-11 16:46:16 +01:00
Willy Tarreau	fec56c6a76	BUG/MINOR: listener: fix off-by-one in state name check As reported in issue #380, the state check in listener_state_str() is invalid as it allows state value 9 to report crap. We don't use such a state value so the issue should never happen unless the memory is already corrupted, but better clean this now while it's harmless. This should be backported to all maintained branches.	2019-12-11 15:51:37 +01:00
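The kind of off-by-one described in the commit above typically looks like a `<=` where a `<` is needed when indexing a table of names. The following is a generic sketch: the enum, the table and `demo_state_str()` are hypothetical and much smaller than the real listener state list.

```c
#include <stddef.h>

/* Hypothetical three-state enum; DEMO_STATES is the number of states. */
enum demo_state { DEMO_NEW, DEMO_INIT, DEMO_READY, DEMO_STATES };

static const char *const demo_state_names[DEMO_STATES] = {
    "NEW", "INI", "RDY",
};

static const char *demo_state_str(unsigned int st)
{
    /* A "<=" comparison here would accept st == DEMO_STATES and read
     * one entry past the end of the table. */
    if (st < sizeof(demo_state_names) / sizeof(demo_state_names[0]))
        return demo_state_names[st];
    return "INVALID";
}
```

Deriving the bound from the array size rather than from a separate constant keeps the check correct if states are added later.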
Willy Tarreau	d26c9f9465	BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers If a new process is started with -sf and it fails to bind, it may send a SIGTTOU to the master process in hope that it will temporarily unbind. Unfortunately this one doesn't catch it and stops to background instead of forwarding the signal to the workers. The same is true for SIGTTIN. This commit simply implements an extra signal handler for the master to deal with such signals that must be passed down to the workers. It must be backported as far as 1.8, though there the code differs in that it's entirely in haproxy.c and doesn't require an extra sig handler.	2019-12-11 14:26:53 +01:00
Willy Tarreau	c49ba52524	MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups We used to have wake_expired_tasks() wake up tasks and return the next expiration delay. The problem this causes is that we have to call it just before poll() in order to consider latest timers, but this also means that we don't wake up all newly expired tasks upon return from poll(), which thus systematically requires a second poll() round. This is visible when running any scheduled task like a health check, as there are systematically two poll() calls, one with the interval, nothing is done after it, and another one with a zero delay, and the task is called: listen test bind *:8001 server s1 127.0.0.1:1111 check 09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0 09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0 09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0 >> nothing run here, as the expired task was not woken up yet. 
09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0 09:37:39.202505 epoll_wait(3, [], 200, 0) = 0 09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0 >> now the expired task was woken up 09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0 09:37:39.202683 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0 09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:37:39.202715 close(7) = 0 Let's instead split the function in two parts: - the first part, wake_expired_tasks(), called just before process_runnable_tasks(), wakes up all expired tasks; it doesn't compute any timeout. - the second part, next_timer_expiry(), called just before poll(), only computes the next timeout for the current thread. 
Thanks to this, all expired tasks are properly woken up when leaving poll, and each poll call's timeout remains up to date: 09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0 09:41:16.270457 epoll_wait(3, [], 200, 999) = 0 09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0 09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0 09:41:17.270323 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0 09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:41:17.270367 close(7) = 0 This may be backported to 2.1 and 2.0 though it's unlikely to bring any user-visible improvement except to clarify debugging.	2019-12-11 09:42:58 +01:00
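The two-part split described in the commit above can be sketched with a flat array of timers. This is an illustration only: `struct demo_task`, `wake_expired()` and `next_expiry_delay()` are hypothetical names, and the real scheduler uses per-thread wait queues rather than a linear scan.

```c
#include <stddef.h>

#define TIMEOUT_NONE (-1) /* no timer pending: poll may wait indefinitely */

/* Hypothetical, simplified task: an absolute expiry date in ms. */
struct demo_task {
    int expire_ms;
    int queued; /* 1 while waiting in the timer queue */
    int woken;  /* 1 once made runnable */
};

/* Part 1: wake every queued task whose expiry date has passed.
 * It does not compute any timeout. */
static int wake_expired(struct demo_task *t, size_t n, int now_ms)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (t[i].queued && t[i].expire_ms <= now_ms) {
            t[i].queued = 0;
            t[i].woken = 1;
            count++;
        }
    }
    return count;
}

/* Part 2: only compute the delay until the nearest remaining expiry,
 * to be used as the poll timeout. */
static int next_expiry_delay(const struct demo_task *t, size_t n, int now_ms)
{
    int best = TIMEOUT_NONE;

    for (size_t i = 0; i < n; i++) {
        int delay;

        if (!t[i].queued)
            continue;
        delay = t[i].expire_ms - now_ms;
        if (delay < 0)
            delay = 0;
        if (best == TIMEOUT_NONE || delay < best)
            best = delay;
    }
    return best;
}
```

Calling the first helper right after returning from the poller and the second one right before entering it is what removes the extra zero-delay poll round seen in the first trace.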
Willy Tarreau	440d09b244	BUG/MINOR: tasks: only requeue a task if it was already in the queue Commit `0742c314c3` ("BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().") had a slight side effect on expired timeouts: when called before a timeout is updated, it causes an existing task to be requeued earlier than its expected timeout, so the next poll wakeup timeout is too early, or even immediate if the previous wakeup was caused by a timeout. This is visible in strace when health checks are enabled because there are two poll calls, one of which has a short or zero delay. The correct solution is to only requeue a task if it was already in the queue. This can be backported to all branches having the fix above.	2019-12-11 09:21:36 +01:00
Willy Tarreau	a1d97f88e0	REORG: listener: move the global listener queue code to listener.c The global listener queue code and declarations were still lying in haproxy.c while not needed there anymore at all. This complicates the code for no reason. As a result, the global_listener_queue_task and the global_listener_queue were made static.	2019-12-10 14:16:03 +01:00
Willy Tarreau	241797a3fc	MINOR: listener: split dequeue_all_listener() in two We use it half times for the global_listener_queue and half times for a proxy's queue and this requires the callers to take care of these. Let's split it in two versions, the current one working only on the global queue and another one dedicated to proxies for the per-proxy queues. This cleans up quite a bit of code.	2019-12-10 14:14:09 +01:00
Willy Tarreau	a45a8b5171	MEDIUM: init: set NO_NEW_PRIVS by default when supported HAProxy doesn't need to call executables at run time (except when using external checks which are strongly recommended against), and is even expected to isolate itself into an empty chroot. As such, there basically is no valid reason to allow a setuid executable to be called without the user being fully aware of the risks. In a situation where haproxy would need to call external checks and/or disable chroot, exploiting a vulnerability in a library or in haproxy itself could lead to the execution of an external program. On Linux it is possible to lock the process so that any setuid bit present on such an executable is ignored. This significantly reduces the risk of privilege escalation in such a situation. This is what haproxy does by default. In case this causes a problem to an external check (for example one which would need the "ping" command), then it is possible to disable this protection by explicitly adding this directive in the global section. If enabled, it is possible to turn it back off by prefixing it with the "no" keyword. Before the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" uid=0(root) gid=0(root) groups=0(root After the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?	2019-12-06 17:20:26 +01:00
Olivier Houchard	0742c314c3	BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity(). In task_set_affinity(), leave the wait_queue if any before changing the affinity, and re-enter a wait queue once it is done. If we don't do that, the task may stay in the wait queue of another thread, and we later may end up modifying that wait queue while holding no lock, which could lead to memory corruption. This should be backported to 2.1, 2.0 and 1.9.	2019-12-05 15:11:19 +01:00
Willy Tarreau	d96f1126fe	MEDIUM: init: prevent process and thread creation at runtime Some concerns are regularly raised about the risk to inherit some Lua files which make use of a fork (e.g. via os.execute()) as well as whether or not some of the bugs we fix might or might not be exploitable to run some code. Given that haproxy is event-driven, any foreground activity completely stops processing and is easy to detect, but background activity is a different story. A Lua script could very well discreetly fork a sub-process connecting to a remote location and taking commands, and some injected code could also try to hide its activity by creating a process or a thread without blocking the rest of the processing. While such activities should be extremely limited when run in an empty chroot without any permission, it would be better to get a higher assurance they cannot happen. This patch introduces something very simple: it limits the number of processes and threads to zero in the workers after the last thread was created. By doing so, it effectively instructs the system to fail on any fork() or clone() syscall. Thus any undesired activity has to happen in the foreground and is way easier to detect. This will obviously break external checks (whose concept is already totally insecure), and for this reason a new option "insecure-fork-wanted" was added to disable this protection, and it is suggested in the fork() error report from the checks. It is obviously recommended not to use it and to reconsider the reasons leading to it being enabled in the first place. If for any reason we fail to disable forks, we still start because it could be imaginable that some operating systems refuse to set this limit to zero, but in this case we emit a warning, that may or may not be reported since we're after the fork point. Ideally over the long term it should be conditioned by strict-limits and cause a hard fail.	2019-12-03 11:49:00 +01:00
Emmanuel Hocdet	e9a100e982	BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0 Commit `d4f9a60e` "MINOR: ssl: deduplicate ca-file" uses undeclared X509 functions when built with openssl < 1.1.0. Introduce these functions in openssl-compat.h. Fix issue #385.	2019-12-03 07:13:12 +01:00
Emmanuel Hocdet	d4f9a60ee2	MINOR: ssl: deduplicate ca-file Typically, a server line like 'server-template srv 1-1000 *:443 ssl ca-file ca-certificates.crt' loads ca-certificates.crt 1000 times, and the copies stay duplicated in memory. The same applies to bind lines: the ca-file is loaded for each certificate. With this patch, the same 'ca-file' is loaded only once and stays deduplicated in memory. As a corollary, this will prevent file access for ca-file when updating a certificate via CLI.	2019-11-28 11:11:20 +01:00
Willy Tarreau	cdb27e8295	MINOR: version: this is development again, update the status It's basically a revert of commit `9ca7f8cea`.	2019-11-25 20:38:32 +01:00
Willy Tarreau	2e077f8d53	[RELEASE] Released version 2.2-dev0 Released version 2.2-dev0 with the following main changes : - exact copy of 2.1.0	2019-11-25 20:36:16 +01:00
Willy Tarreau	9ca7f8ceac	MINOR: version: indicate that this version is stable Also indicate that it will get fixes till ~Q1 2021.	2019-11-25 19:47:23 +01:00
Willy Tarreau	c22d5dfeb8	MINOR: h2: add a function to report H2 error codes as strings Just like we have frame type to string, let's have error to string to improve debugging and traces.	2019-11-25 11:34:26 +01:00
Willy Tarreau	8f3ce06f14	MINOR: ist: add ist_find_ctl() This new function looks for the first control character in a string (a char whose value is between 0x00 and 0x1F included) and returns it, or NULL if there is none. It is optimized for quickly evicting non-matching strings and scans ~0.43 bytes per cycle. It can be used as an accelerator when it's needed to look up several of these characters (e.g. CR/LF/NUL).	2019-11-25 10:33:35 +01:00
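For reference, the behaviour described in the commit above (return the first character between 0x00 and 0x1F included, or NULL) can be written as a naive byte-by-byte loop. This sketch is not the optimized implementation the commit refers to; `find_ctl_naive()` is a hypothetical name and it takes a plain pointer and length rather than haproxy's string type.

```c
#include <stddef.h>

/* Naive reference: scan one byte at a time and return a pointer to the
 * first control character (0x00..0x1F), or NULL if there is none. */
static const char *find_ctl_naive(const char *ptr, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if ((unsigned char)ptr[i] <= 0x1F)
            return ptr + i;
    }
    return NULL;
}
```

A loop like this is handy as an oracle when testing a faster word-at-a-time version against random inputs.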
Willy Tarreau	47479eb0e7	MINOR: version: emit the link to the known bugs in output of "haproxy -v" The link to the known bugs page for the current version is built and reported there. When it is a development version (less than 2 dots), instead a link to github open issues is reported as there's no way to be sure about the current situation in this case and it's better that users report their trouble there.	2019-11-21 18:48:20 +01:00
Willy Tarreau	08dd202d73	MINOR: version: report the version status in "haproxy -v" As discussed on Discourse here: https://discourse.haproxy.org/t/haproxy-branch-support-lifetime/4466 it's not always easy for end users to know the lifecycle of the version they are using. This patch introduces a "Status" line in the output of "haproxy -vv" indicating whether it's a development, stable, long-term supported version, possibly with an estimated end of life for the branch when it can be anticipated (e.g. for stable versions). This field should be adjusted when creating a major release to reflect the new status. It may make sense to backport this to other branches to clarify the situation.	2019-11-21 18:47:54 +01:00
William Lallemand	8b453912ce	MINOR: ssl: ssl_sock_prepare_ctx() return an error code Rework ssl_sock_prepare_ctx() so it fills a buffer with the error messages instead of using ha_alert()/ha_warning(). Also returns an error code (ERR_*) instead of the number of errors.	2019-11-21 17:48:11 +01:00
Daniel Corbett	f8716914c7	MEDIUM: dns: Add resolve-opts "ignore-weight" It was noted in #48 that there are times when a configuration may use the server-template directive with SRV records and simultaneously want to control weights using an agent-check or through the runtime api. This patch adds a new option "ignore-weight" to the "resolve-opts" directive. When specified, any weight indicated within an SRV record will be ignored. This is for both initial resolution and ongoing resolution.	2019-11-21 17:25:31 +01:00
Frédéric Lécaille	ec1c10b839	MINOR: peers: Add debugging information to "show peers". This patch adds three counters to help in debugging peers protocol issues to "peer" struct: ->no_hbt counts the number of reconnection period without receiving heartbeat ->new_conn counts the number of reconnections after ->reconnect timeout expirations. ->proto_err counts the number of protocol errors.	2019-11-19 14:48:28 +01:00
Frédéric Lécaille	33cab3c0eb	MINOR: peers: Add TX/RX heartbeat counters. Add RX/TX heartbeat counters to "peer" struct to have an idea about which peer is alive or not. Dump these counters values on the CLI via "show peers" command.	2019-11-19 14:48:25 +01:00

6744 Commits