haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-12 10:06:58 +02:00

Author	SHA1	Message	Date
Christopher Faulet	18630643a9	MINOR: http-htx: Use a dedicated function to release http_reply objects A function to release an http_reply object has been added. It is now called when an http return rule is released.	2020-05-20 18:27:13 +02:00
Christopher Faulet	5ff0c64921	MINOR: http-rules: Use http_reply structure for http return rules No real change here. Instead of using an internal structure to the action rule, the http return rules are now stored as an http reply. The main change is about the action type. It is now always set to ACT_CUSTOM. The http reply type is used to know how to evaluate the rule.	2020-05-20 18:27:13 +02:00
Christopher Faulet	b6ea17c6fc	CLEANUP: http-htx: Rename http_error structure into http_error_msg The structure owns an error message, most of the time loaded from a file, and converted to HTX. It is created when an errorfile or errorloc directive is parsed. It is renamed to avoid ambiguities with http_reply structure.	2020-05-20 18:27:13 +02:00
Christopher Faulet	7bd3de06e7	MINOR: http-htx: Add http_reply type based on what is used for http return rules The http_reply structure is added. It represents a generic HTTP message used as internal response by HAProxy. It is based on the structure used to store http return rules. The aim is to store all error messages using this structure, as well as http return and http deny rules.	2020-05-20 18:27:13 +02:00
Christopher Faulet	a53abad42d	CLEANUP: http_ana: Remove unused TXN flags TX_CLDENY, TX_CLALLOW, TX_SVDENY and TX_SVALLOW flags are unused. Only TX_CLTARPIT is used to make the difference between an http deny rule and an http tarpit rule. So these unused flags are removed.	2020-05-20 18:27:13 +02:00
William Lallemand	8177ad9895	MINOR: ssl: split config and runtime variable for ssl-{min,max}-ver In the CLI command 'show ssl crt-list', the ssl-min-ver and the ssl-max-ver arguments were always displayed because the dumped versions were the actual versions computed and used by haproxy, instead of the versions found in the configuration. To fix the problem, this patch separates the variables to have one with the configured version, and one with the actual version used. The dump only shows the configured version.	2020-05-20 16:49:02 +02:00
Willy Tarreau	d68a6927f7	Revert "MEDIUM: sink: add global statement to create a new ring (sink buffer)" This reverts commit `957ec59571`. As discussed with Emeric, the current syntax is not extensible enough, this will be turned to a section instead in a forthcoming patch.	2020-05-20 12:06:16 +02:00
Willy Tarreau	928068a74b	MINOR: ring: make the applet code not depend on the CLI The ring to applet communication was only made to deal with CLI functions but it's generic. Let's have generic appctx functions and have the CLI rely on these instead. This patch introduces ring_attach_appctx() and ring_detach_appctx().	2020-05-19 19:37:12 +02:00
Willy Tarreau	9597cbd17a	MINOR: applet: adopt the wait list entry from the CLI A few fields, including a generic list entry, were added to the CLI context by commit `300decc8d9` ("MINOR: cli: extend the CLI context with a list and two offsets"). It turns out that the list entry (l0) is solely used to consult rings and that the generic ring_write() code is restricted to a consumer on the CLI due to this, which was not the initial intent. Let's make it a general purpose wait_entry field that is properly initialized during appctx_init(). This will allow any applet to wait on a ring, not just the CLI.	2020-05-19 19:37:12 +02:00
Willy Tarreau	2bdcc70fa7	MEDIUM: hpack: use a pool for the hpack table Instead of using malloc/free to allocate an HPACK table, let's declare a pool. However the HPACK size is configured by the H2 mux, so it's also this one which allocates it after post_check.	2020-05-19 11:40:39 +02:00
Emeric Brun	957ec59571	MEDIUM: sink: add global statement to create a new ring (sink buffer) This patch adds the new global statement: ring <name> [desc <desc>] [format <format>] [size <size>] [maxlen <length>] Creates a named ring buffer which could be used on log line for instance. <desc> is an optional description string of the ring. It will appear on CLI. By default, <name> is reused to fill this field. <format> is the log format used when generating syslog messages. It may be one of the following: iso A message containing only the ISO date, followed by the text. The PID, process name and system name are omitted. This is designed to be used with a local log server. raw A message containing only the text. The level, PID, date, time, process name and system name are omitted. This is designed to be used in containers or during development, where the severity only depends on the file descriptor used (stdout/stderr). This is the default. rfc3164 The RFC3164 syslog message format. This is the default. (https://tools.ietf.org/html/rfc3164) rfc5424 The RFC5424 syslog message format. (https://tools.ietf.org/html/rfc5424) short A message containing only a level between angle brackets such as '<3>', followed by the text. The PID, date, time, process name and system name are omitted. This is designed to be used with a local log server. This format is compatible with what the systemd logger consumes. timed A message containing only a level between angle brackets such as '<3>', followed by ISO date and by the text. The PID, process name and system name are omitted. This is designed to be used with a local log server. <length> is the maximum length of event message stored into the ring, including formatted header. If the event message is longer than <length>, it would be truncated to this length. <name> is the ring identifier, which follows the same naming convention as proxies and servers. <size> is the optional size in bytes. Default value is set to BUFSIZE. 
Note: Historically sink's name and desc were refs on const strings. But with new configurable rings a dynamic allocation is needed.	2020-05-19 11:04:11 +02:00
Emeric Brun	e709e1e777	MEDIUM: logs: buffer targets now rely on new sink_write Before this patch, they relied directly on ring_write, bypassing a part of the sink API. Now the maxlen parameter of the log will apply only on the text message part (and not the header, for this you would prefer to use the maxlen parameter on the sink/ring). sink_write prototype was also reviewed to return the number of bytes written to be compliant with the other write functions.	2020-05-19 11:04:11 +02:00
Emeric Brun	bd163817ed	MEDIUM: sink: build header in sink_write for log formats This patch extends the sink_write prototype and code to handle the rfc5424 and rfc3164 header. It uses header building tools from log.c. Doing this, some functions/vars have been externalized. facility and minlevel have been removed from the struct sink and passed as arguments to sink_write because they depend on the log and not on the sink (they remained unused by the rest of the code until now).	2020-05-19 11:04:11 +02:00
William Dauchy	1665c43fd8	BUILD: ssl: include buffer common headers for ssl_sock_ctx Since commit `c0cdaffaa3` ("REORG: ssl: move ssl_sock_ctx and fix cross-dependencies issues"), `struct ssl_sock_ctx` was moved in ssl_sock.h. As it contains a `struct buffer`, including `common/buffer.h` is now mandatory. I encountered an issue while including ssl_sock.h on another patch: include/types/ssl_sock.h:240:16: error: field ‘early_buf’ has incomplete type 240 | struct buffer early_buf; /* buffer to store the early data received */ No backport needed. Fixes: `c0cdaffaa3` ("REORG: ssl: move ssl_sock_ctx and fix cross-dependencies issues") Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-05-18 08:29:32 +02:00
Marcin Deranek	4dc2b57d51	MINOR: stats: Prepare for more accurate moving averages Add swrate_add_dynamic function which is similar to swrate_add, but more accurate when calculating moving averages when not enough samples have been processed yet.	2020-05-16 22:40:00 +02:00
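The idea behind a warm-up-aware moving average can be sketched as follows. This is only an illustration under stated assumptions, not the code added by the commit: the `sketch_*` names, the struct layout and the choice to plainly accumulate samples until the window is full are all hypothetical.

```c
#include <assert.h>

/* Hypothetical names: a sketch of the idea, not HAProxy's swrate_* code.
 * Once the window is full, <sum> holds roughly window * average.
 */
struct sketch_avg {
	unsigned int sum;     /* sliding sum */
	unsigned int samples; /* samples seen so far, capped at <window> */
	unsigned int window;  /* nominal window size, must be > 0 */
};

/* Fixed-window update: always decays by sum/window, even during warm-up. */
static unsigned int sketch_add_fixed(unsigned int sum, unsigned int window, unsigned int v)
{
	assert(window > 0);
	return sum - (sum + window - 1) / window + v;
}

/* Dynamic update: plain accumulation until the window is full, then decay. */
static void sketch_add_dynamic(struct sketch_avg *a, unsigned int v)
{
	assert(a->window > 0);
	if (a->samples < a->window) {
		a->sum += v;
		a->samples++;
	} else {
		a->sum = sketch_add_fixed(a->sum, a->window, v);
	}
}

/* Average over the samples actually collected, rounded up. */
static unsigned int sketch_avg_value(const struct sketch_avg *a)
{
	if (!a->samples)
		return 0;
	return (a->sum + a->samples - 1) / a->samples;
}
```

With a window of 8 and only three samples of value 10, the dynamic variant reports 10 while the fixed-window sum divided by 8 is still well below it, which is the inaccuracy the commit message refers to.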
William Lallemand	6a66a5ec9b	REORG: ssl: move utility functions to src/ssl_utils.c These functions are mainly used to extract information from certificates.	2020-05-15 14:11:54 +02:00
William Lallemand	15e169447d	REORG: ssl: move sample fetches to src/ssl_sample.c Move all SSL sample fetches to src/ssl_sample.c.	2020-05-15 14:11:54 +02:00
William Lallemand	c0cdaffaa3	REORG: ssl: move ssl_sock_ctx and fix cross-dependencies issues In order to move all SSL sample fetches in another file, moving the ssl_sock_ctx definition in a .h file is required. Unfortunately it became a cross dependencies hell to solve, because of the struct wait_event field, so <types/connection.h> is needed which created other problems.	2020-05-15 14:11:54 +02:00
William Lallemand	ef76107a4b	MINOR: ssl: remove static keyword in some SSL utility functions In order to move the sample fetches to another file, remove the static keyword of some utility functions in the SSL fetches.	2020-05-15 14:11:54 +02:00
William Lallemand	dad3105157	REORG: ssl: move ssl configuration to cfgparse-ssl.c Move all the configuration parsing of the ssl keywords in cfgparse-ssl.c	2020-05-15 14:11:54 +02:00
William Lallemand	da8584c1ea	REORG: ssl: move the CLI 'cert' functions to src/ssl_ckch.c Move the 'ssl cert' CLI functions to src/ssl_ckch.c.	2020-05-15 14:11:54 +02:00
William Lallemand	c756bbd3df	REORG: ssl: move the crt-list CLI functions in src/ssl_crtlist.c Move the crtlist functions for the CLI to src/ssl_crtlist.c	2020-05-15 14:11:54 +02:00
William Lallemand	03c331c80a	REORG: ssl: move the ckch_store related functions to src/ssl_ckch.c Move the cert_key_and_chain functions: int ssl_sock_load_files_into_ckch(const char path, struct cert_key_and_chain ckch, char *err); int ssl_sock_load_pem_into_ckch(const char path, char buf, struct cert_key_and_chain ckch , char *err); void ssl_sock_free_cert_key_and_chain_contents(struct cert_key_and_chain ckch); int ssl_sock_load_key_into_ckch(const char path, char buf, struct cert_key_and_chain ckch , char err); int ssl_sock_load_ocsp_response_from_file(const char ocsp_path, char buf, struct cert_key_and_chain ckch, char *err); int ssl_sock_load_sctl_from_file(const char sctl_path, char buf, struct cert_key_and_chain ckch, char *err); int ssl_sock_load_issuer_file_into_ckch(const char path, char buf, struct cert_key_and_chain ckch, char *err); And the utility ckch_store functions: void ckch_store_free(struct ckch_store store) struct ckch_store ckch_store_new(const char filename, int nmemb) struct ckch_store ckchs_dup(const struct ckch_store src) ckch_store ckchs_lookup(char path) ckch_store ckchs_load_cert_file(char path, int multi, char **err)	2020-05-15 14:11:54 +02:00
William Lallemand	c1c50b46e9	CLEANUP: ssl: avoid circular dependencies in ssl_crtlist.h Add forward declarations in types/ssl_crtlist.h in order to avoid circular dependencies. Also remove the listener.h include which is not needed anymore.	2020-05-15 14:11:54 +02:00
William Lallemand	6e9556b635	REORG: ssl: move crtlist functions to src/ssl_crtlist.c Move the crtlist functions to src/ssl_crtlist.c and their definitions to proto/ssl_crtlist.h. The following functions were moved: /* crt-list entry functions / void ssl_sock_free_ssl_conf(struct ssl_bind_conf conf); char crtlist_dup_filters(char args, int fcount); void crtlist_free_filters(char *args); void crtlist_entry_free(struct crtlist_entry entry); struct crtlist_entry crtlist_entry_new(); / crt-list functions / void crtlist_free(struct crtlist crtlist); struct crtlist crtlist_new(const char filename, int unique); /* file loading / int crtlist_parse_line(char line, char *crt_path, struct crtlist_entry entry, const char file, int linenum, char err); int crtlist_parse_file(char file, struct bind_conf bind_conf, struct proxy curproxy, struct crtlist crtlist, char err); int crtlist_load_cert_dir(char path, struct bind_conf bind_conf, struct crtlist crtlist, char err);	2020-05-15 14:11:54 +02:00
William Lallemand	c69973f7eb	CLEANUP: ssl: add ckch prototypes in proto/ssl_ckch.h Remove the static definitions of the ckch functions and add them to ssl_ckch.h in order to use them outside ssl_sock.c.	2020-05-15 14:11:54 +02:00
William Lallemand	d4632b2b6d	REORG: ssl: move the ckch structures to types/ssl_ckch.h Move all the structures used for loading the SSL certificates in ssl_ckch.h	2020-05-15 14:11:54 +02:00
William Lallemand	be21b663cd	REORG: move the crt-list structures in their own .h Move the structure definitions specifics to the crt-list in types/ssl_crtlist.h.	2020-05-15 14:11:54 +02:00
William Lallemand	7fd8b4567e	REORG: ssl: move macros and structure definitions to ssl_sock.h The ssl_sock.c file contains a lot of macros and structure definitions that should be in a .h. Move them to the more appropriate types/ssl_sock.h file.	2020-05-15 14:11:54 +02:00
Dragan Dosen	eb607fe6a1	MINOR: ssl: add a new function ssl_sock_get_ssl_object() This one can be used later to get a SSL object from connection. It will return NULL if connection is not established over SSL.	2020-05-14 13:13:14 +02:00
Dragan Dosen	1e7ed04665	MEDIUM: ssl: allow to register callbacks for SSL/TLS protocol messages This patch adds the ability to register callbacks for SSL/TLS protocol messages by using the function ssl_sock_register_msg_callback(). All registered callback functions will be called when observing received or sent SSL/TLS protocol messages.	2020-05-14 13:13:14 +02:00
Christopher Faulet	325504cf89	BUG/MINOR: sample/ssl: Fix digest converter for openssl < 1.1.0 The EVP_MD_CTX_create() and EVP_MD_CTX_destroy() functions were renamed to EVP_MD_CTX_new() and EVP_MD_CTX_free() in OpenSSL 1.1.0, respectively. These functions are used by the digest converter, introduced by the commit `8e36651ed` ("MINOR: sample: Add digest and hmac converters"). So for prior versions of openssl, macros are used to fallback on old functions. This patch must only be backported if the commit `8e36651ed` is backported too.	2020-05-12 16:30:41 +02:00
Willy Tarreau	5778fea4da	CLEANUP: remove THREAD_LOCAL from config.h This one really ought to be defined in hathreads.h like all other thread definitions, which is what this patch does. As expected, all files but one (regex.h) were already including hathreads.h when using THREAD_LOCAL; regex.h was fixed for this. This was the last entry in config.h which is now useless.	2020-05-09 09:08:09 +02:00
Willy Tarreau	3bc4e8bfe6	CLEANUP: config: move CONFIG_HAP_LOCKLESS_POOLS out of config.h The setting of CONFIG_HAP_LOCKLESS_POOLS depending on threads and compat was done in config.h for use only in memory.h and memory.c where other settings are dealt with. Further, the default pool cache size was set there from a fixed value instead of being set from defaults.h. Let's move the decision to enable lockless pools via CONFIG_HAP_LOCKLESS_POOLS to memory.h, and set the default pool cache size in defaults.h like other default settings. This was the next-to-last setting in config.h.	2020-05-09 09:02:35 +02:00
Willy Tarreau	755afc08d5	CLEANUP: config: drop unused setting CONFIG_HAP_INLINE_FD_SET CONFIG_HAP_INLINE_FD_SET was introduced in 1.3.3 and dropped in 1.3.9 when the pollers were reworked, let's remove it.	2020-05-09 08:57:48 +02:00
Willy Tarreau	571eb3d659	CLEANUP: config: drop unused setting CONFIG_HAP_MEM_OPTIM CONFIG_HAP_MEM_OPTIM was introduced with memory pools in 1.3 and dropped in 1.6 when pools became the only way to allocate memory. Still the option remained present in config.h. Let's kill it.	2020-05-09 08:53:31 +02:00
Christopher Faulet	67a234583e	CLEANUP: checks: sort and rename tcpcheck_expect_type types The same naming format is used for all expect rules. And names are sorted to be grouped by type.	2020-05-06 12:38:44 +02:00
Christopher Faulet	aaab0836d9	MEDIUM: checks: Add matching on log-format string for expect rules It is now possible to use log-format string (or hexadecimal string for the binary version) to match a content in tcp-check based expect rules. For hexadecimal log-format string, the conversion in binary is performed after the string evaluation, during health check execution. The pattern keywords to use are "string-lf" for the log-format string and "binary-lf" for the hexadecimal log-format string.	2020-05-06 08:31:29 +02:00
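For the "binary-lf" case, the second step (turning the evaluated hexadecimal string into raw bytes) could look like the sketch below. It only illustrates the conversion; `sketch_hex_digit()` and `sketch_hex2bin()` are hypothetical helpers, not functions from the HAProxy tree, and the log-format evaluation itself is assumed to have already produced the hex string.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if <c> is not one. */
static int sketch_hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Converts <len> hex characters from <hex> into bytes stored in <out>
 * (capacity <cap>). Returns the number of bytes written, or -1 on an odd
 * length, an invalid digit, or an output buffer that is too small.
 */
static int sketch_hex2bin(const char *hex, size_t len, unsigned char *out, size_t cap)
{
	size_t i;

	assert(hex != NULL && out != NULL);
	if (len % 2 || len / 2 > cap)
		return -1;
	for (i = 0; i < len; i += 2) {
		int hi = sketch_hex_digit(hex[i]);
		int lo = sketch_hex_digit(hex[i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i / 2] = (unsigned char)((hi << 4) | lo);
	}
	return (int)(len / 2);
}
```

Doing the conversion after evaluation means a malformed result can only be detected at check time, which is why the sketch reports errors through its return value instead of assuming valid input.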
Willy Tarreau	a4d9ee3d1c	BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_UPDATE_{MIN,MAX}() Just like in previous patch, it happens that HA_ATOMIC_UPDATE_MIN() and HA_ATOMIC_UPDATE_MAX() would evaluate the (val) argument up to 3 times. However this time it affects both thread and non-thread versions. It's strange because the copy was properly performed for the (new) argument in order to avoid this. Anyway it was done for the "val" one as well. A quick code inspection showed that this currently has no effect as these macros are fairly limited in usage. It would be best to backport this for long-term stability (till 1.8) but it will not fix an existing bug.	2020-05-05 16:18:52 +02:00
Willy Tarreau	d66345d6b0	BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_CAS() When threads are disabled, HA_ATOMIC_CAS() becomes a simple compound expression. However this expression presents a problem, which is that its arguments are evaluated multiple times, once for the comparison and once again for the assignment. This presents a risk of performing some side-effect operations twice in the non-threaded case (e.g. in case of auto-increment or function return). The macro was rewritten using local copies for arguments like the other macros do. Fortunately a complete inspection of the code indicates that this case currently never happens. It was however responsible for the strict-aliasing warning emitted when building fd.c without threads but with 64-bit CAS. This may be backported as far as 1.8 though it will not fix any existing bug and is more of a long-term safety measure in case a future fix would depend on this behavior.	2020-05-05 16:05:45 +02:00
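The multiple-evaluation problem and the local-copy fix can be illustrated with two simplified macros. These are not the HA_ATOMIC_CAS() definitions; the `SKETCH_*` names and the counting helper are hypothetical, and the safe variant assumes the GCC/Clang statement-expression and `__typeof__` extensions.

```c
#include <assert.h>

/* Naive non-threaded CAS: <val> appears twice on each path, so an argument
 * with side effects is evaluated twice.
 */
#define SKETCH_CAS_NAIVE(val, old, newv) \
	(*(val) == *(old) ? (*(val) = (newv), 1) : (*(old) = *(val), 0))

/* Safer variant: every argument is copied exactly once into a local before
 * being used (GCC/Clang extensions assumed).
 */
#define SKETCH_CAS_SAFE(val, old, newv)                        \
	({                                                     \
		__typeof__(val) _v = (val);                    \
		__typeof__(old) _o = (old);                    \
		__typeof__(*(val)) _n = (newv);                \
		(*_v == *_o) ? (*_v = _n, 1) : (*_o = *_v, 0); \
	})

/* Helper with a visible side effect: counts how often it is called. */
static int sketch_slot;
static int sketch_calls;

static int *sketch_counted_ptr(void)
{
	sketch_calls++;
	return &sketch_slot;
}
```

Passing `sketch_counted_ptr()` as the first argument makes the difference observable: the naive macro calls it twice on a successful swap, the safe one calls it once.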
Baptiste Assmann	0e9d87bf06	MINOR: istbuf: add ist2buf() function Purpose of this function is to build a <struct buffer> from a <struct ist>.	2020-05-05 15:28:59 +02:00
Baptiste Assmann	de80201460	MINOR: ist: add istissame() function The istissame() function takes two ists and compares their <.ptr> and <.len> values respectively. It returns non-zero if they are the same.	2020-05-05 15:28:59 +02:00
Baptiste Assmann	9ef1967af7	MINOR: ist: add istadv() function The purpose of istadv() function is to move forward <.ptr> by <nb> characters. It is very useful when parsing a payload.	2020-05-05 15:28:59 +02:00
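The pointer-plus-length operations described in these ist commits can be sketched with a small stand-in type. This is not the real `struct ist` header: the `sketch_*` names are hypothetical, and clamping the advance to the remaining length is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an indirect string: a pointer and a length. */
struct sketch_ist {
	const char *ptr;
	size_t len;
};

static struct sketch_ist sketch_ist_make(const char *ptr, size_t len)
{
	struct sketch_ist s = { ptr, len };

	assert(ptr != NULL || len == 0);
	return s;
}

/* Moves <ptr> forward by <nb> characters, clamped to what is left
 * (the clamping is an assumption of this sketch).
 */
static struct sketch_ist sketch_istadv(struct sketch_ist s, size_t nb)
{
	if (nb > s.len)
		nb = s.len;
	s.ptr += nb;
	s.len -= nb;
	return s;
}

/* Non-zero if both designate the same memory area with the same length.
 * This compares the pointers, not the characters they point to.
 */
static int sketch_istissame(struct sketch_ist a, struct sketch_ist b)
{
	return a.ptr == b.ptr && a.len == b.len;
}
```

When parsing a payload, advancing returns a new view over the same buffer, so no bytes are copied.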
Christopher Faulet	3970819a55	MEDIUM: checks: Support matching on headers for http-check expect rules It is now possible to add http-check expect rules matching HTTP header names and values. Here is the format of these rules: http-check expect header name [ -m <meth> ] <name> [log-format] \ [ value [ -m <meth> ] <value> [log-format] [full] ] The name pattern (name ...) is mandatory but the value pattern (value ...) is optional. If not specified, only the header presence is verified. <meth> is the matching method, applied on the header name or the header value. Supported matching methods are: * "str" (exact match) * "beg" (prefix match) * "end" (suffix match) * "sub" (substring match) * "reg" (regex match) If not specified, exact matching method is used. If the "log-format" option is used, the pattern (<name> or <value>) is evaluated as a log-format string. This option cannot be used with the regex matching method. Finally, by default, the header value is considered as comma-separated list. Each part may be tested. The "full" option may be used to test the full header line. Note that matching is case-insensitive on the header names.	2020-05-05 11:19:27 +02:00
Christopher Faulet	8dd33e13a5	MINOR: http-htx: Support different methods to look for header names It is now possible to use different matching methods to look for header names in an HTTP message: * The exact match. It is the default method. http_find_header() uses this method. http_find_str_header() is an alias. * The prefix match. It evals the header names starting by a prefix. http_find_pfx_header() must be called to use this method. * The suffix match. It evals the header names ending by a suffix. http_find_sfx_header() must be called to use this method. * The substring match. It evals the header names containing a string. http_find_sub_header() must be called to use this method. * The regex match. It evals the header names matching a regular expression. http_match_header() must be called to use this method.	2020-05-05 11:07:00 +02:00
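The exact, prefix, suffix and substring methods listed above can be sketched over plain C strings as shown below. This does not use the HTX API or the http_find_*_header() functions; `sketch_match_name()` and its enum are hypothetical, and the regex method is left out. The comparison is case-insensitive, in line with how header names are matched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matching methods, mirroring "str", "beg", "end" and "sub". */
enum sketch_match { SKETCH_STR, SKETCH_BEG, SKETCH_END, SKETCH_SUB };

/* Case-insensitive comparison of the first <n> characters. */
static int sketch_ieq(const char *a, const char *b, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return 0;
	return 1;
}

/* Returns non-zero if header name <name> matches <pat> using method <m>. */
static int sketch_match_name(const char *name, const char *pat, enum sketch_match m)
{
	size_t nl, pl, i;

	assert(name != NULL && pat != NULL);
	nl = strlen(name);
	pl = strlen(pat);
	if (pl > nl)
		return 0;

	switch (m) {
	case SKETCH_STR:
		return nl == pl && sketch_ieq(name, pat, pl);
	case SKETCH_BEG:
		return sketch_ieq(name, pat, pl);
	case SKETCH_END:
		return sketch_ieq(name + nl - pl, pat, pl);
	case SKETCH_SUB:
		for (i = 0; i + pl <= nl; i++)
			if (sketch_ieq(name + i, pat, pl))
				return 1;
		return 0;
	}
	return 0;
}
```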
Christopher Faulet	778f5ed478	MEDIUM: checks/http-fetch: Support htx prefetch from a check for HTTP samples Some HTTP sample fetches will be accessible from the context of a http-check health check. Thus, the prefetch function responsible to return the HTX message has been updated to handle a check, in addition to a channel. Both cannot be used at the same time. So there is no ambiguity.	2020-05-05 11:06:43 +02:00
Willy Tarreau	86c6a9221a	BUG/MEDIUM: shctx: bound the number of loops that can happen around the lock Given that a "count" value of 32M was seen in _shctx_wait4lock(), it is very important to prevent this from happening again. It's absolutely essential to prevent the value from growing unbounded because with an increase of the number of threads, the number of successive failed attempts will necessarily grow. Instead now we're scanning all 2^p-1 values from 3 to 255 and are bounding to count to 255 so that in the worst case each thread tries an xchg every 255 failed read attempts. That's one every 4 on average per thread when there are 64 threads, which corresponds to the initial count of 4 for the first attempt so it seems like a reasonable value to keep a low latency. The bug was introduced with the shctx entries in 1.5 so the fix must be backported to all versions. Before 1.8 the function was called _shared_context_wait4lock() and was in shctx.c.	2020-05-01 13:32:20 +02:00
Willy Tarreau	3801bdc3fc	BUG/MEDIUM: shctx: really check the lock's value while waiting Jérôme reported an amazing crash in the spinlock version of _shctx_wait4lock() with an extremely high <count> value of 32M! The root cause is that the function cannot deal with contention on the lock at all because it forgets to check if the lock's value has changed! As such, every time it's called due to a contention, it waits twice as long before trying again and lets the caller check for the contention by itself. The correct thing to do is to compare the value again at each loop. This way it makes sure to mostly perform read accesses on the shared cache line without writing too often, and to be ready fast enough to try to grab the lock. And we must not increase the count on success either! Unfortunately I'd have expected to see a performance boost on the cache with this but there was absolutely no change, so it's very likely that these issues only happen once in a while and are sufficient to derail the process when they strike, but not to have a permanent performance impact. The bug was introduced with the shctx entries in 1.5 so the fix must be backported to all versions. Before 1.8 the function was called _shared_context_wait4lock() and was in shctx.c.	2020-05-01 13:29:14 +02:00
Willy Tarreau	f0e5da20e1	BUG/MINOR: debug: properly use long long instead of long for the thread ID I changed my mind twice on this one and pushed after the last test with threads disabled, without re-enabling long long, causing this rightful build warning. This needs to be backported if the previous commit `ff64d3b027` ("MINOR: threads: export the POSIX thread ID in panic dumps") is backported as well.	2020-05-01 12:26:03 +02:00
Willy Tarreau	ff64d3b027	MINOR: threads: export the POSIX thread ID in panic dumps It is very difficult to map a panic dump against a gdb thread dump because the thread numbers do not match. However gdb provides the pthread ID but this one is supposed to be opaque and not to be cast to a scalar. This patch provides a function, ha_get_pthread_id() which retrieves the pthread ID of the indicated thread and casts it to an unsigned long long so as to lose the least possible amount of information from it. This is done cleanly using a union to maintain alignment so as long as these IDs are stored on 1..8 bytes they will be properly reported. This ID is now presented in the panic dumps so it now becomes possible to map these threads. When threads are disabled, zero is returned. For example, this is a panic dump: Thread 1 is about to kill the process. >Thread 1 : id=0x7fe92b825180 act=0 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0 stuck=1 prof=0 harmless=0 wantrdv=0 cpu_ns: poll=5119122 now=2009446995 diff=2004327873 curr_task=0xc99bf0 (task) calls=4 last=0 fct=0x592440(task_run_applet) ctx=0xca9c50(<CLI>) strm=0xc996a0 src=unix fe=GLOBAL be=GLOBAL dst=<CLI> rqf=848202 rqa=0 rpf=80048202 rpa=0 sif=EST,200008 sib=EST,204018 af=(nil),0 csf=0xc9ba40,8200 ab=0xca9c50,4 csb=(nil),0 cof=0xbf0e50,1300:PASS(0xc9cee0)/RAW((nil))/unix_stream(20) cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0) call trace(20): | 0x59e4cf [48 83 c4 10 5b 5d 41 5c]: wdt_handler+0xff/0x10c | 0x7fe92c170690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690 | 0x7ffce29519d9 [48 c1 e2 20 48 09 d0 48]: linux-vdso:+0x9d9 | 0x7ffce2951d54 [eb d9 f3 90 e9 1c ff ff]: linux-vdso:__vdso_gettimeofday+0x104/0x133 | 0x57b484 [48 89 e6 48 8d 7c 24 10]: main+0x157114 | 0x50ee6a [85 c0 75 76 48 8b 55 38]: main+0xeaafa | 0x50f69c [48 63 54 24 20 85 c0 0f]: main+0xeb32c | 0x59252c [48 c7 c6 d8 ff ff ff 44]: task_run_applet+0xec/0x88c Thread 2 : id=0x7fe92b6e6700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0 stuck=0 prof=0 harmless=1 
wantrdv=0 cpu_ns: poll=786738 now=1086955 diff=300217 curr_task=0 Thread 3 : id=0x7fe92aee5700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0 stuck=0 prof=0 harmless=1 wantrdv=0 cpu_ns: poll=828056 now=1129738 diff=301682 curr_task=0 Thread 4 : id=0x7fe92a6e4700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0 stuck=0 prof=0 harmless=1 wantrdv=0 cpu_ns: poll=818900 now=1153551 diff=334651 curr_task=0 And this is the gdb output: (gdb) info thr Id Target Id Frame 1 Thread 0x7fe92b825180 (LWP 15234) 0x00007fe92ba81d6b in raise () from /lib64/libc.so.6 2 Thread 0x7fe92b6e6700 (LWP 15235) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6 3 Thread 0x7fe92a6e4700 (LWP 15237) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6 4 Thread 0x7fe92aee5700 (LWP 15236) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6 We can clearly see that while threads 1 and 2 are the same, gdb's threads 3 and 4 respectively are haproxy's threads 4 and 3. This may be backported to 2.0 as it removes some confusion in github issues.	2020-05-01 11:45:56 +02:00
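The union technique mentioned in this commit can be sketched as follows. This is not ha_get_pthread_id() itself: `sketch_pthread_id()` is a hypothetical helper that takes a `pthread_t` directly, and it assumes that `pthread_t` fits in 8 bytes, which the static assertion enforces at build time.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Assumption of this sketch: the opaque ID is stored on at most 8 bytes. */
_Static_assert(sizeof(pthread_t) <= sizeof(unsigned long long),
               "pthread_t does not fit in an unsigned long long");

/* Reports an opaque pthread_t as an unsigned long long through a union,
 * without casting the pthread_t itself to a scalar.
 */
static unsigned long long sketch_pthread_id(pthread_t t)
{
	union {
		pthread_t t;
		unsigned long long urep;
	} u;

	memset(&u, 0, sizeof(u)); /* clear bytes that pthread_t may not cover */
	u.t = t;
	return u.urep;
}
```

Printing the result with `%llx` for each thread would give values comparable to the `Thread 0x...` identifiers shown by gdb on platforms where `pthread_t` is an integer or a pointer.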
Christopher Faulet	dc75d577b9	CLEANUP: checks: Fix checks includes	2020-04-29 13:32:29 +02:00
Christopher Faulet	1543d44607	MINOR: http-htx: Export functions to update message authority and host These functions will be used by HTTP health checks when a request is formatted before sending it.	2020-04-29 13:32:29 +02:00
Damien Claisse	57c8eb939d	MINOR: log: Add "Tu" timer It can sometimes be useful to measure the total time of a request as seen from an end user, including TCP/TLS negotiation, server response time and transfer time. "Tt" currently provides something close to that, but it also takes client idle time into account, which is problematic for keep-alive requests as idle time can be very long. "Ta" is also not sufficient as it hides TCP/TLS negotiation time. To improve that, introduce a "Tu" timer that excludes idle time but keeps everything else. It roughly estimates the time spent from the user's point of view (without DNS resolution time), assuming network latency is the same in both directions.	2020-04-28 16:30:13 +02:00
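One way to read this description, taken here as an assumption rather than as the implementation, is that "Tu" equals "Tt" minus the idle time. The helper below is hypothetical and only shows that arithmetic, treating a negative idle time as "not measured".

```c
#include <assert.h>

/* Estimated user-perceived time in milliseconds: total time minus idle time.
 * <ti_ms> may be negative when no idle time was measured; it is then ignored.
 */
static long sketch_tu_ms(long tt_ms, long ti_ms)
{
	assert(tt_ms >= 0);
	return tt_ms - (ti_ms > 0 ? ti_ms : 0);
}
```

For a keep-alive request that waited 5000 ms idle and then took 200 ms to be served, Tt would be 5200 ms while this estimate gives 200 ms.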
Christopher Faulet	bfb0f72d52	BUG/MEDIUM: sessions: Always pass the mux context as argument to destroy a mux This bug was introduced by the commit `2444aa5b` ("MEDIUM: sessions: Don't be responsible for connections anymore."). In session_check_idle_conn(), when the mux is destroyed, its context must be passed as argument instead of the connection. It is a 2.2-dev bug. No need to backport.	2020-04-27 15:53:43 +02:00
Christopher Faulet	4a8c026117	BUG/MINOR: checks/server: use_ssl member must be signed	2020-04-27 12:13:06 +02:00
Christopher Faulet	8021a5f4a5	MINOR: checks: Support list of status codes on http-check expect rules It is now possible to match on a comma-separated list of status codes or range of codes. In addition, instead of a string comparison to match the response's status code, an integer comparison is performed. Here is an example: http-check expect status 200,201,300-310	2020-04-27 10:46:28 +02:00
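Matching a status code against a list such as `200,201,300-310` with integer comparisons can be sketched like this. `sketch_status_match()` is a hypothetical helper, not the parser used by the checks code, and it stops at the first matching entry without validating the rest of the list.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if <status> matches one entry of the comma-separated <list> of
 * codes or ranges (e.g. "200,201,300-310"), 0 if none matches, and -1 if a
 * malformed entry is met before a match is found.
 */
static int sketch_status_match(const char *list, int status)
{
	const char *p = list;

	assert(list != NULL);
	while (*p) {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)
			return -1; /* no digits where a code was expected */
		p = end;
		if (*p == '-') {
			hi = strtol(p + 1, &end, 10);
			if (end == p + 1)
				return -1; /* range without an upper bound */
			p = end;
		}
		if (status >= lo && status <= hi)
			return 1;
		if (*p == ',')
			p++;
		else if (*p)
			return -1; /* unexpected separator */
	}
	return 0;
}
```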
Christopher Faulet	88d939c831	Revert "MEDIUM: checks: capture groups in expect regexes" This reverts commit 1979943c30ef285ed04f07ecf829514de971d9b2. Captures in comments were only used, when a tcp-check expect rule based on a negative regex match failed, to report what was captured while it was not expected. It is a bit far-fetched to be usable IMHO. on-error and on-success log-format strings are far more usable. For now there are few check sample fetches (in fact only one...). But it could be really powerful to report info in logs.	2020-04-27 10:46:28 +02:00
Christopher Faulet	d7cee71e77	MINOR: checks: Use a tree instead of a list to store tcp-check rulesets Since all tcp-check rulesets are globally stored, it is a problem to use a list. For configurations with many backends, the lookups in a list may be costly and slow down HAProxy startup. To solve this problem, tcp-check rulesets are now stored in a tree.	2020-04-27 10:46:28 +02:00
Christopher Faulet	0417975bdc	MINOR: ist: Add a function to retrieve the ist pointer There is already the istlen() function to get the ist length. Now, it is possible to call istptr() to get the ist pointer.	2020-04-27 10:46:28 +02:00
Christopher Faulet	61cc852230	CLEANUP: checks: Reorg checks.c file to be more readable The patch is not obvious at the first glance. But it is just a reorg. Functions have been grouped and ordered in a more logical way. Some structures and flags are now private to the checks module (so moved from the .h to the .c file).	2020-04-27 10:46:28 +02:00
Christopher Faulet	d7e639661a	MEDIUM: checks: Implement default TCP check using tcp-check rules Default health-checks, without any option, doing only a connection check, are now based on tcp-checks. An implicit default tcp-check connect rule is used. A shared tcp-check ruleset, named "*tcp-check", is created to support these checks.	2020-04-27 10:46:28 +02:00
Christopher Faulet	a9e1c4c7c2	MINOR: connection: Add a function to install a mux for a health-check This function is unused for now. But it will be used to install a mux for an outgoing connection opened in a health-check context. In this case, the session's origin is the check itself, and it is used to know the mode, HTTP or TCP, depending on the tcp-check type and not the proxy mode. The check is also used to get the mux protocol if configured.	2020-04-27 09:39:38 +02:00
Christopher Faulet	b356714769	MINOR: checks: Add a mux proto to health-check and tcp-check connect rule It is not set and not used for now, but it will be possible to force the mux protocol thanks to this patch. A mux proto field is added to the checks and to tcp-check connect rules.	2020-04-27 09:39:38 +02:00
Christopher Faulet	a142c1deb4	BUG/MINOR: obj_type: Handle stream object in obj_base_ptr() function The stream object (OBJ_TYPE_STREAM) was missing in the switch statement of the obj_base_ptr() function. This patch must be backported as far as 2.0.	2020-04-27 09:39:38 +02:00
Christopher Faulet	3829046893	MINOR: checks/obj_type: Add a new object type for checks An object type is now assigned to the check structure.	2020-04-27 09:39:38 +02:00
Christopher Faulet	e60abd1a06	MINOR: connection: Add macros to know if a conn or a cs uses an HTX mux IS_HTX_CONN() and IS_HTX_CS() may now be used to know if a connection or a conn-stream uses an HTX-based multiplexer.	2020-04-27 09:39:38 +02:00
Christopher Faulet	e5870d872b	MAJOR: checks: Implement HTTP check using tcp-check rules HTTP health-checks are now internally based on tcp-checks. Of course all the configuration parsing of the "http-check" keyword and the httpchk option has been rewritten. But the main change is that now, as for tcp-check rulesets, it is possible to perform several send/expect sequences in the same health-check. Thus the connect rule is now also available from HTTP checks, just like set-var, unset-var and comment rules. Because the request defined by the "option httpchk" line is used for the first request only, it is now possible to set the method, the uri and the version on a "http-check send" line.	2020-04-27 09:39:38 +02:00
Christopher Faulet	5eb96cbcbc	MINOR: standard: Add my_memspn and my_memcspn Do the same as strspn() and strcspn() but on a raw bytes buffer.	2020-04-27 09:39:38 +02:00
Christopher Faulet	12d5740a38	MINOR: checks: Introduce flags to configure in tcp-check expect rules Instead of having 2 independent integers, used as boolean values, to know if the expect rule is inverted and to know if the matching regexp has captures, we now use a 32-bit bitfield.	2020-04-27 09:39:38 +02:00
Christopher Faulet	f930e4c4df	MINOR: checks: Use an indirect string to represent the expect matching string Instead of having a string in the expect union with its length outside of the union, directly in the expect structure, an indirect string is now used.	2020-04-27 09:39:38 +02:00
Christopher Faulet	404f919995	MEDIUM: checks: Use a shared ruleset to store tcp-check rules All tcp-check rules are now stored in the global shared list. The ones created to parse a specific protocol, for instance redis, are already stored in this list. Now pure tcp-check rules are also stored in it. The ruleset name is created using the proxy name and its config file and line. tcp-check rules declared in a defaults section are also stored this way using "defaults" as proxy name. For now, all tcp-check rulesets are stored in a list. But it could be a bit slow to look for a specific ruleset with a huge number of backends. So, it could be a good idea to use a tree instead.	2020-04-27 09:39:38 +02:00
Christopher Faulet	6f5579160a	MINOR: proxy/checks: Move parsing of external-check option in checks.c Parsing of the proxy directive "option external-check" has been moved to checks.c.	2020-04-27 09:39:38 +02:00
Christopher Faulet	430e480510	MINOR: proxy/checks: Move parsing of tcp-check option in checks.c Parsing of the proxy directive "option tcp-check" has been moved to checks.c.	2020-04-27 09:39:38 +02:00
Christopher Faulet	6c2a743538	MINOR: proxy/checks: Move parsing of httpchk option in checks.c Parsing of the proxy directive "option httpchk" has been moved to checks.c.	2020-04-27 09:39:38 +02:00
Christopher Faulet	ec07e386a7	MINOR: checks: Add an option to set success status of tcp-check expect rules It is now possible to specify the healthcheck status to use on success of a tcp-check rule, if it is the last evaluated rule. The option "ok-status" supports "L4OK", "L6OK", "L7OK" and "L7OKC" status.	2020-04-27 09:39:38 +02:00
Christopher Faulet	799f3a4621	MINOR: Produce tcp-check info message for pure tcp-check rules only This way, messages reported by protocol checks are closer to the old ones.	2020-04-27 09:39:38 +02:00
Christopher Faulet	0ae3d1dbdf	MEDIUM: checks: Implement agent check using tcp-check rules A shared tcp-check ruleset is now created to support agent checks. The following sequence is used : tcp-check send "%[var(check.agent_string)]" log-format tcp-check expect custom The custom function to evaluate the expect rule does the same as what was done to handle the agent response when a custom check was used.	2020-04-27 09:39:38 +02:00
Christopher Faulet	267b01b761	MEDIUM: checks: Implement SPOP check using tcp-check rules A shared tcp-check ruleset is now created to support SPOP checks. This way no extra memory is used if several backends use a SPOP check. The following sequence is used : tcp-check send-binary SPOP_REQ tcp-check expect custom min-recv 4 The spop request is the result of the function spoe_prepare_healthcheck_request() and the expect rule relies on a custom function calling spoe_handle_healthcheck_response().	2020-04-27 09:39:38 +02:00
Christopher Faulet	1997ecaa0c	MEDIUM: checks: Implement LDAP check using tcp-check rules A shared tcp-check ruleset is now created to support LDAP check. This way no extra memory is used if several backends use a LDAP check. The following sequence is used : tcp-check send-binary "300C020101600702010304008000" tcp-check expect rbinary "^30" min-recv 14 \ on-error "Not LDAPv3 protocol" tcp-check expect custom The last expect rule relies on a custom function to check the LDAP server reply.	2020-04-27 09:39:38 +02:00
Christopher Faulet	f2b3be5c27	MEDIUM: checks: Implement MySQL check using tcp-check rules A shared tcp-check ruleset is now created to support MySQL checks. This way no extra memory is used if several backends use a MySQL check. One of the following sequences is used : ## If no extra params are set tcp-check connect default linger tcp-check expect custom ## will test the initial handshake ## If the username is defined tcp-check connect default linger tcp-check send-binary MYSQL_REQ log-format tcp-check expect custom ## will test the initial handshake tcp-check expect custom ## will test the reply to the client message The log-format hexa string MYSQL_REQ depends on 2 preset variables, the packet header containing the packet length and the sequence ID (check.header) and the username (check.username). It is also different if the "post-41" option is set or not. Expect rules rely on custom functions to check MySQL server packets.	2020-04-27 09:39:38 +02:00
Christopher Faulet	ce355074f1	MEDIUM: checks: Implement postgres check using tcp-check rules A shared tcp-check ruleset is now created to support postgres check. This way no extra memory is used if several backends use a pgsql check. The following sequence is used : tcp-check connect default linger tcp-check send-binary PGSQL_REQ log-format tcp-check expect !rstring "^E" min-recv 5 \ error-status "L7RSP" on-error "%[check.payload(6,0)]" tcp-check expect rbinary "^520000000800000000" min-recv "9" \ error-status "L7STS" \ on-success "PostgreSQL server is ok" \ on-error "PostgreSQL unknown error" The log-format hexa string PGSQL_REQ depends on 2 preset variables, the packet length (check.plen) and the username (check.username).	2020-04-27 09:39:38 +02:00
Christopher Faulet	fbcc77c6ba	MEDIUM: checks: Implement smtp check using tcp-check rules A shared tcp-check ruleset is now created to support smtp checks. This way no extra memory is used if several backends use a smtp check. The following sequence is used : tcp-check connect default linger tcp-check expect rstring "^[0-9]{3}[ \r]" min-recv 4 \ error-status "L7RSP" on-error "%[check.payload(),cut_crlf]" tcp-check expect rstring "^2[0-9]{2}[ \r]" min-recv 4 \ error-status "L7STS" \ on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \ status-code "check.payload(0,3)" tcp-check send "%[var(check.smtp_cmd)]\r\n" log-format tcp-check expect rstring "^2[0-9]{2}[- \r]" min-recv 4 \ error-status "L7STS" \ on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \ on-success "%[check.payload(4,0),ltrim(' '),cut_crlf]" \ status-code "check.payload(0,3)" The variable check.smtp_cmd is by default the string "HELO localhost" but may be customized by setting <helo> and <domain> parameters on the option smtpchk line. Note there is a difference with the old smtp check. The server greeting message is checked before sending the HELO/EHLO command.	2020-04-27 09:39:38 +02:00
Christopher Faulet	811f78ced1	MEDIUM: checks: Implement ssl-hello check using tcp-check rules A shared tcp-check ruleset is now created to support ssl-hello check. This way no extra memory is used if several backends use a ssl-hello check. The following sequence is used : tcp-check send-binary SSLV3_CLIENT_HELLO log-format tcp-check expect rbinary "^1[56]" min-recv 5 \ error-status "L6RSP" tout-status "L6TOUT" SSLV3_CLIENT_HELLO is a log-format hexa string representing a SSLv3 CLIENT HELLO packet. It is the same as the one used by the old ssl-hello except the sample expression "%[date(),htonl,hex]" is used to set the date field.	2020-04-27 09:39:38 +02:00
Christopher Faulet	33f05df650	MEDIUM: checks: Implement redis check using tcp-check rules A shared tcp-check ruleset is now created to support redis checks. This way no extra memory is used if several backends use a redis check. The following sequence is used : tcp-check send "*1\r\n$4\r\nPING\r\n" tcp-check expect string "+PONG\r\n" error-status "L7STS" \ on-error "%[check.payload(),cut_crlf]" on-success "Redis server is ok"	2020-04-27 09:39:38 +02:00
Christopher Faulet	9e6ed1598e	MINOR: checks: Support custom functions to eval a tcp-check expect rules It is now possible to set a custom function to evaluate a tcp-check expect rule. It is an internal and not documented option because the right function pointer must be set and it is not possible to express it in the configuration. It will be used to convert some protocol healthchecks to tcp-checks. Custom functions must have the following signature: enum tcpcheck_eval_ret (*custom)(struct check *, struct tcpcheck_rule *, int);	2020-04-27 09:39:38 +02:00
Christopher Faulet	6f87adcf20	MINOR: checks: Export the tcpcheck_eval_ret enum This enum will be used to define custom function for tcp-check expect rules.	2020-04-27 09:39:38 +02:00
Christopher Faulet	7a1e2e1823	MEDIUM: checks: Add a list of vars to set before executing a tcp-check ruleset A list of variables is now associated to each tcp-check ruleset. It is more or less a list of set-var expressions. This list may be filled during the configuration parsing. The listed variables will then be set during each execution of the tcp-check healthcheck, at the beginning, before execution of the first tcp-check rule. This patch is mandatory to convert all protocol checks to tcp-checks. It is a way to customize shared tcp-check rulesets.	2020-04-27 09:39:37 +02:00
Christopher Faulet	bb591a1a11	MINOR: checks: Relax the default option for tcp-check connect rules Now this option may be mixed with other options. This way, options on the server line are used but may be overridden by tcp-check connect options.	2020-04-27 09:39:37 +02:00
Christopher Faulet	98cc57cf5c	MEDIUM: checks: Add status-code sample expression on tcp-check expect rules This option defines a sample expression, evaluated as an integer, to set the status code (check->code) if a tcp-check healthcheck ends on the corresponding expect rule.	2020-04-27 09:39:37 +02:00
Christopher Faulet	be52b4de66	MEDIUM: checks: Add on-error/on-success option on tcp-check expect rules These options define log-format strings used to produce the info message if a tcp-check expect rule fails (on-error option) or succeeds (on-success option). For this last option, it must be the ending rule, otherwise the parameter is ignored.	2020-04-27 09:39:37 +02:00
Christopher Faulet	cf80f2f263	MINOR: checks: Add option to tcp-check expect rules to customize error status It is now possible to specify the healthcheck status to use on error or on timeout for tcp-check expect rules. First, to define the error status, the option "error-status" must be used followed by "L4CON", "L6RSP", "L7RSP" or "L7STS". Then, to define the timeout status, the option "tout-status" must be used followed by "L4TOUT", "L6TOUT" or "L7TOUT". These options will be used to convert specific protocol healthchecks (redis, pgsql...) to tcp-check ones.	2020-04-27 09:39:37 +02:00
Christopher Faulet	1032059bd0	MINOR: checks: Use a name for the healthcheck status enum The enum defining all healthcheck status (HCHK_STATUS_*) is now named.	2020-04-27 09:39:37 +02:00
Christopher Faulet	5d503fcf5b	MEDIUM: checks: Add a shared list of tcp-check rules A global list of tcp-check rulesets can now be used to share common rulesets with all backends without any duplication. It is mandatory to convert all specific protocol checks (redis, pgsql...) to tcp-check healthchecks. To do so, a flag is now attached to each tcp-check ruleset to know if it is a shared ruleset or not. tcp-check rules defined in a backend are still directly attached to the proxy and not shared. In addition a second flag is used to know if the ruleset is inherited from the defaults section.	2020-04-27 09:39:37 +02:00
Christopher Faulet	f50f4e956f	MEDIUM: checks: Support log-format strings for tcp-check send rules An extra parameter for tcp-check send rules can be specified to handle the string or the hexa string as a log-format one. Using "log-format" option, instead of considering the data to send as raw data, it is parsed as a log-format string. Thus it is possible to call sample fetches to customize data sent to a server. Of course, because we have no stream attached to healthchecks, not all sample fetches are available. So be careful. tcp-check set-var(check.port) int(8000) tcp-check set-var(check.uri) str(/status) tcp-check connect port var(check.port) tcp-check send "GET %[check.uri] HTTP/1.0\r\n" log-format tcp-check send "Host: %[srv_name]\r\n" log-format tcp-check send "\r\n"	2020-04-27 09:39:37 +02:00
Christopher Faulet	b7d30098f3	MEDIUM: checks: Support expression to set the port Since we have a session attached to tcp-check healthchecks, it is possible to use sample expressions and variables. In addition, it is possible to add tcp-check set-var rules to define custom variables. So, now, a sample expression can be used to define the port to use to establish a connection for a tcp-check connect rule. For instance: tcp-check set-var(check.port) int(8888) tcp-check connect port var(check.port)	2020-04-27 09:39:37 +02:00
Christopher Faulet	5c28874a69	MINOR: checks: Add the addr option for tcp-check connect rule With this option, it is now possible to use a specific address to open the connection for a tcp-check connect rule. If the port option is also specified, it is used in priority.	2020-04-27 09:39:37 +02:00
Christopher Faulet	d75f57e94c	MINOR: ssl: Export a generic function to parse an alpn string Parsing of an alpn string has been moved in a dedicated function and exposed to be used from outside the ssl_sock module.	2020-04-27 09:39:37 +02:00
Christopher Faulet	085426aea9	MINOR: checks: Add the via-socks4 option for tcp-check connect rules With this option, it is possible to establish the connection opened by a tcp-check connect rule using an upstream socks4 proxy. Info from the socks4 parameter on the server are used.	2020-04-27 09:39:37 +02:00
Christopher Faulet	79b31d4ee5	MINOR: checks: Add the sni option for tcp-check connect rules With this option, it is possible to specify the SNI to be used for the SSL connection opened by a tcp-check connect rule.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	707b52f17e	MEDIUM: checks: Parse custom action rules in tcp-checks Register the custom action rules "set-var" and "unset-var", that will call the parse_store() command upon parsing. These rules are thus built and integrated to the tcp-check ruleset, but have no further effect for the moment.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	13a5043a9e	MINOR: checks/vars: Add a check scope for variables Add a dedicated vars scope for checks. This scope is considered as part of the session scope for accounting purposes. The scope can be addressed by a valid session, even embryonic. The stream is not necessary. The scope is initialized after the check session is created. All variables are then pruned before the session is destroyed.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	05d692dc09	MEDIUM: checks: Associate a session to each tcp-check healthcheck Create a session for each healthcheck relying on a tcp-check ruleset. When such check is started, a session is allocated, which will be freed when the check finishes. A dummy static frontend is used to create these sessions. This will be useful to support variables and sample expression. This will also be used, later, by HTTP healthchecks to rely on HTTP muxes.	2020-04-27 09:39:37 +02:00
Christopher Faulet	b2c2e0fcca	MAJOR: checks: Refactor and simplify the tcp-check loop The loop in tcpcheck_main() function is quite hard to understand. Depending on where we are in the loop, the current_step is the currently executed rule or the one to execute on the next call to tcpcheck_main(). When the check result is reported, we rely on the rule pointed by last_started_step or the one pointed by current_step. In addition, the loop does not use the common list_for_each_entry macro and it is thus quite confusing. So the loop has been totally rewritten and split into several functions to simplify its reading and its understanding. Tcp-check rules are evaluated in dedicated functions. And a common for_each loop is used and only one rule is referenced, the current one.	2020-04-27 09:39:37 +02:00
Christopher Faulet	a202d1d4c1	MEDIUM: checks: Add implicit tcp-check connect rule After the configuration parsing, during its validity check, an implicit tcp-check connect rule is added in front of the tcp-check ruleset if the first non-comment rule is not a connect one. This implicit rule is flagged to use the default check parameter. This means now, all tcp-check rulesets begin with a connect and are never empty. When tcp-check healthchecks are used, all connections are thus handled by tcpcheck_main() function.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	06d963aeca	MINOR: checks: define a tcp-check connect type The check rule itself is not changed, only its representation.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	48219dc50e	MINOR: checks: define tcp-check send type The check rule itself is not changed, only its representation.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	5301b01f99	MINOR: checks: Set the tcp-check rule index during parsing Now the position of a tcp-check rule in a chain is set during the parsing. This significantly simplifies the function retrieving the current step id.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	04578dbf37	MINOR: checks: Don't use a static tcp rule list head To allow reusing these blocks without consuming more memory, their list should be static and share-able across uses. The head of the list will be shared as well. It is thus necessary to extract the head of the rule list from the proxy itself. Transform it into a pointer instead, that can be easily set to an external dynamically allocated head.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	9dcb09fc98	MEDIUM: checks: capture groups in expect regexes Parse back-references in comments of tcp-check expect rules. If references are made, capture groups in the match and replace references to it within the comment when logging the error. Both text and binary regex can capture groups and reference them in the expect rule comment. [Cf: I slightly updated the patch. exp_replace() function is used instead of a custom one. And if the trash buffer is too small to contain the comment during the substitution, the comment is ignored.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	efab6c61d9	MINOR: checks: add rbinary expect match type The rbinary match works similarly to the rstring match type, however the received data is rewritten as hex-string before the match operation is done. This allows using regexes on binary content even with the POSIX regex engine. [Cf: I slightly updated the patch. mem2hex function was removed and dump_binary is used instead.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	b616add793	MINOR: checks: define a tcp expect type Extract the expect definition from its tcpcheck ; create a standalone type.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	f8ba6773e5	MINOR: checks: add linger option to tcp connect Allow declaring tcpcheck connect commands with a new parameter, "linger". This option will configure the connection to avoid using an RST segment to close, instead following the four-way termination handshake. Some servers would otherwise log each healthcheck as an error.	2020-04-27 09:39:37 +02:00
Gaetan Rivet	1afd826ae4	MINOR: checks: add min-recv tcp-check expect option Some expect rules cannot be satisfied due to inherent ambiguity towards the received data: in the absence of match, the current behavior is to be forced to wait either the end of the connection or a buffer full, whichever comes first. Only then is the matching diagnostic considered conclusive. For instance : tcp-check connect tcp-check expect !rstring "^error" tcp-check expect string "valid" This check will only succeed if the connection is closed by the server before the check timeout. Otherwise the first expect rule will wait for more data until "^error" regex matches or the check expires. Allow the user to explicitly define an amount of data that will be considered enough to determine the value of the check. This allows succeeding on negative rstring rules, as previously in valid condition no match happened, and the matching was repeated until the end of the connection. This could timeout the check while no error was happening. [Cf: I slightly updated the patch. The parameter was renamed and the value is a signed integer to support -1 as default value to ignore the parameter.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	4038b94706	MEDIUM: checks: rewind to the first inverse expect rule of a chain on new data When receiving additional data while chaining multiple tcp-check expects, previous inverse expects might have a different result with the new data. They need to be evaluated again against the new data. Add a pointer to the first inverse expect rule of the current expect chain (possibly of length one) to each expect rule. When receiving new data, the currently evaluated tcp-check rule is set back to this pointed rule. Functionally speaking, it is a bug and it exists since the introduction of the feature. But there is no way for now to hit it because when an expect rule does not match, we wait for more data, independently of the inverse flag. The only way to move to the following rule is to be sure no more data will be received. This patch depends on the commit "MINOR: mini-clist: Add functions to iterate backward on a list". [Cf: I slightly updated the patch. First, it only concerns inverse expect rule. Normal expect rules are not concerned. Then, I removed the BUG tag because, for now, it is not possible to move to the following rule when the current one does not match while more data can be received.]	2020-04-27 09:39:37 +02:00
Gaetan Rivet	dd66732ffe	MINOR: checks: Use an enum to describe the tcp-check rule type Replace the generic integer with an enumerated list. This allows light type check and helps debugging (seeing action = 2 in the struct is not helpful).	2020-04-27 09:39:37 +02:00
Christopher Faulet	31c30fdf1e	CLEANUP: checks: Don't export anymore init_check and srv_check_healthcheck_port These functions are no longer called outside the checks.	2020-04-27 09:39:37 +02:00
Christopher Faulet	f61f33a1b2	BUG/MINOR: checks: Respect the no-check-ssl option This option is used to force a non-SSL connection to check a SSL server or to invert a check-ssl option inherited from the default section. The use_ssl field in the check structure is used to know if a SSL connection must be used (use_ssl=1) or not (use_ssl=0). The server configuration is used by default. The problem is that we cannot distinguish the default case (no specific SSL check option) and the case of an explicit non-SSL check. In both, use_ssl is set to 0. So the server configuration is always used. For a SSL server, when no-check-ssl option is set, the check is still performed using a SSL configuration. To fix the bug, instead of a boolean value (0=TCP, 1=SSL), we use a ternary value : * 0 = use server config * 1 = force SSL * -1 = force non-SSL The same is done for the server parameter. It is not really necessary for now. But it is a good way to know if the server no-ssl option is set. In addition, the PR_O_TCPCHK_SSL proxy option is no longer used to set use_ssl to 1 for a check. Instead the flag is directly tested to prepare or destroy the server SSL context. This patch should be backported as far as 1.8.	2020-04-27 09:39:37 +02:00
Christopher Faulet	8acb1284bc	MINOR: checks: Add a way to send custom headers and payload during http checks The 'http-check send' directive has been added to add headers and optionally a payload to the request sent during HTTP healthchecks. The request line may be customized by the "option httpchk" directive but there was no official way to add extra headers. An old trick consisted of hiding these headers at the end of the version string, on the "option httpchk" line. And it was impossible to add an extra payload with an "http-check expect" directive because of the "Connection: close" header appended to the request (See issue #16 for details). So to make things official and fully support payload additions, the "http-check send" directive has been added : option httpchk POST /status HTTP/1.1 http-check send hdr Content-Type "application/json;charset=UTF-8" \ hdr X-test-1 value1 hdr X-test-2 value2 \ body "{id: 1, field: \"value\"}" When a payload is defined, the Content-Length header is automatically added. So chunk-encoded requests are not supported yet. For now, there is no special validity checks on the extra headers. This patch is inspired by Kiran Gavali's work. It should fix the issue #16 and as far as possible, it may be backported, at least as far as 1.8.	2020-04-27 09:39:37 +02:00
Christopher Faulet	bc1f54b0fc	MINOR: mini-clist: Add functions to iterate backward on a list list_for_each_entry_rev() and list_for_each_entry_from_rev() and corresponding safe versions have been added to iterate on a list in the reverse order. All these functions work the same way as the forward versions, except they use the .p field to move from one element to another.	2020-04-27 09:39:37 +02:00
Christopher Faulet	aaae9a0e99	BUG/MINOR: check: Update server address and port to execute an external check Server address and port may change at runtime. So the address and port passed as arguments and as environment variables when an external check is executed must be updated. The current number of connections on the server was already updated before executing the command. So the same mechanism is used for the server address and port. But in addition, command arguments are also updated. This patch must be backported to all stable versions. It should fix the issue #577.	2020-04-27 09:39:13 +02:00
Willy Tarreau	62ba9ba6ca	BUG/MINOR: http: make url_decode() optionally convert '+' to SP The url_decode() function used by the url_dec converter and a few other call points is ambiguous on its processing of the '+' character which itself isn't stable in the spec. This one belongs to the reserved characters for the query string but not for the path nor the scheme, in which it must be left as-is. It's only in argument strings that follow the application/x-www-form-urlencoded encoding that it must be turned into a space, that is, in query strings and POST arguments. The problem is that the function is used to process full URLs and paths in various configs, and to process query strings from the stats page for example. This patch updates the function to differentiate the situation where it's parsing a path and a query string. A new argument indicates if a query string should be assumed, otherwise it's only assumed after seeing a question mark. The various locations in the code making use of this function were updated to take care of this (most call places were using it to decode POST arguments). The url_dec converter is usually called on path or url samples, so it needs to remain compatible with this and will default to parsing a path and turning the '+' to a space only after a question mark. However in situations where it would explicitly be extracted from a POST or a query string, it now becomes possible to enforce the decoding by passing a non-null value in argument. It seems to be what was reported in issue #585. This fix may be backported to older stable releases.	2020-04-23 20:03:27 +02:00
Willy Tarreau	09568fd54d	BUG/MINOR: tools: fix the i386 version of the div64_32 function As reported in issue #596, the edx register isn't marked as clobbered in div64_32(), which could technically allow gcc to try to reuse it if it needed a copy of the 32 highest bits of the o1 register after the operation. Two attempts were tried, one using a dummy 32-bit local variable to store the intermediary edx and another one switching to "=A" and making result a long long. It turns out the former makes the resulting object code significantly dirtier while the latter makes it better and was kept. This is due to gcc's difficulties at working with register pairs mixing 32- and 64- bit values on i386. It was verified that no code change happened at all on x86_64, armv7, aarch64 nor mips32. In practice it's only used by the frequency counters so this bug cannot even be triggered but better fix it. This may be backported to stable branches though it will not fix any issue.	2020-04-23 17:21:37 +02:00
Ilya Shipitsin	856aabcda5	CLEANUP: assorted typo fixes in the code and comments This is 8th iteration of typo fixes	2020-04-17 09:37:36 +02:00
Willy Tarreau	bb86986253	MINOR: init: report the haproxy version and executable path once on errors If haproxy fails to start and emits an alert, then it can be useful to have it also emit the version and the path used to load it. Some users may be mistakenly launching the wrong binary due to a misconfigured PATH variable and this will save them some troubleshooting time when it reports that some keywords are not understood. What we do here is that we try to extract the binary name from the AUX vector on glibc, and we report this as a NOTICE tag before the very first alert is emitted.	2020-04-16 10:52:41 +02:00
Ilya Shipitsin	d425950c68	CLEANUP: assorted typo fixes in the code and comments This is 7th iteration of typo fixes	2020-04-16 10:04:36 +02:00
Willy Tarreau	3eb10b8e98	MINOR: init: add -dW and "zero-warning" to reject configs with warnings Since some systems switched to service managers which hide all warnings by default, some users are not aware of some possibly important warnings and get caught too late with errors that could have been detected earlier. This patch adds a new global keyword, "zero-warning" and an equivalent command-line option "-dW" to refuse to start in case any warning is detected. It is recommended to use these with configurations that are managed by humans in order to catch mistakes very early.	2020-04-15 16:42:39 +02:00
Willy Tarreau	bebd212064	MINOR: init: report in "haproxy -c" whether there were warnings or not This helps quickly checking if the config produces any warning. For this we reuse the "warned" bit field to add a new WARN_ANY bit that is set by ha_warning(). The rest of the bit field was also cleaned from unused bits.	2020-04-15 16:42:00 +02:00
Frédéric Lécaille	8ba10fea69	BUG/MINOR: peers: Incomplete peers sections should be validated. Before supporting "server" line in "peers" section, such sections without any local peer were removed from the configuration to get it validated. This patch fixes the issue where a "server" line without address and port which is a remote peer without address and port makes the configuration parsing fail. When encountering such cases we now ignore such lines and remove them from the configuration. Thank you to Jérôme Magnin for having reported this bug. Must be backported to 2.1 and 2.0.	2020-04-15 10:47:39 +02:00
William Lallemand	b7296c42bd	CLEANUP: ssl: remove a commentary in struct ckch_inst The struct ckch_inst now handles the ssl_bind_conf so this commentary is obsolete	2020-04-09 16:13:42 +02:00
William Lallemand	caa161982f	CLEANUP: ssl/cli: use the list of filters in the crtlist_entry In 'commit ssl cert', instead of trying to regenerate a list of filters from the SNIs, use the list provided by the crtlist_entry used to generate the ckch_inst. This list of filters doesn't need to be free'd anymore since they are always reused from the crtlist_entry.	2020-04-08 16:52:51 +02:00
William Lallemand	02e19a5c7b	CLEANUP: ssl: use the refcount for the SSL_CTX' Use the refcount of the SSL_CTX' to free them instead of freeing them under certain conditions. That way we can free the SSL_CTX everywhere its pointer is used.	2020-04-08 16:52:51 +02:00
William Lallemand	c69f02d0f0	MINOR: ssl/cli: replace dump/show ssl crt-list by '-n' option The dump and show ssl crt-list commands do the same thing, they dump the content of a crt-list, but the 'show' displays an ID in the first column. Delete the 'dump' command so it is replaced by the 'show' one. The old 'show' command is replaced by an '-n' option to dump the ID. And the ID which was a pointer is replaced by a line number and placed after colons in the filename. Example: $ echo "show ssl crt-list -n kikyo.crt-list" | socat /tmp/sock1 - # kikyo.crt-list kikyo.pem.rsa:1 secure.domain.tld kikyo.pem.ecdsa:2 secure.domain.tld	2020-04-06 19:33:33 +02:00
Frédéric Lécaille	876ed55d9b	BUG/MINOR: protocol_buffer: Wrong maximum shifting. This patch fixes a bad stop condition when decoding a protocol buffer variable integer whose maximum length is 10 bytes, shifting a uint64_t value by more than 63. Thank you to Ilya for having reported this issue. Must be backported to 2.1 and 2.0.	2020-04-02 15:09:46 +02:00
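The stop condition described in the protocol_buffer fix above can be illustrated with a minimal sketch. This is not HAProxy's actual decoder: the function name `varint_decode_sketch` and its return convention are assumptions made for illustration. It only shows why the shift must never exceed 63 when accumulating 7-bit groups into a uint64_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not HAProxy's actual function: decode a protocol
 * buffers varint from <buf> (at most <len> bytes) into <*val>.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or the varint would need more than 10 bytes.
 */
static size_t varint_decode_sketch(const unsigned char *buf, size_t len, uint64_t *val)
{
    uint64_t result = 0;
    unsigned int shift = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        /* Each byte carries 7 payload bits. The 10th byte is shifted by
         * 63; any larger shift on a uint64_t would be undefined.
         */
        if (shift > 63)
            return 0;
        result |= (uint64_t)(buf[i] & 0x7f) << shift;
        if (!(buf[i] & 0x80)) {
            /* high bit clear: this was the last byte */
            *val = result;
            return i + 1;
        }
        shift += 7;
    }
    return 0; /* ran out of input before the terminating byte */
}
```

For instance, the two bytes 0xac 0x02 decode to 300, while eleven consecutive bytes with the continuation bit set are rejected instead of triggering an oversized shift.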
Olivier Houchard	4a0e7fe4f7	MINOR: connections: Don't mark conn flags 0x00000001 and 0x00000002 as unused. Remove the comments saying 0x00000001 and 0x00000002 are unused, they are now used by CO_FL_SAFE_LIST and CO_FL_IDLE_LIST.	2020-03-31 23:04:20 +02:00
William Lallemand	fa8cf0c476	MINOR: ssl: store a ptr to crtlist in crtlist_entry Store a pointer to crtlist in crtlist_entry so we can re-insert a crtlist_entry in its crtlist ebpt after updating its key.	2020-03-31 12:32:17 +02:00
William Lallemand	23d61c00b9	MINOR: ssl: add a list of crtlist_entry in ckch_store When updating a ckch_store we may want to update its pointer in the crtlist_entry which use it. To do this, we need the list of the entries using the store.	2020-03-31 12:32:17 +02:00
William Lallemand	493983128b	BUG/MINOR: ssl: ckch_inst wrongly inserted in crtlist_entry The instances were wrongly inserted in the crtlist entries, all instances of a crt-list were inserted in the last crt-list entry. Which was kind of handy to free all instances upon error. Now that it's done correctly, the error path was changed, it must iterate on the entries and find the ckch_insts which were generated for this bind_conf. To avoid wasting time, it stops the iteration once it found the first unsuccessful generation.	2020-03-31 12:32:17 +02:00
William Lallemand	ad3c37b760	REORG: ssl: move SETCERT enum to ssl_sock.h Move the SETCERT enum at the right place to cleanup ssl_sock.c.	2020-03-31 12:32:17 +02:00
William Lallemand	79d31ec0d4	MINOR: ssl: add a list of bind_conf in struct crtlist In order to be able to add new certificate in a crt-list, we need the list of bind_conf that uses this crt-list so we can create a ckch_inst for each of them.	2020-03-31 12:32:17 +02:00
William Lallemand	638f6ad033	MINOR: cli: add a general purpose pointer in the CLI struct This patch adds a p2 generic pointer which is initialized to zero before calling the parser.	2020-03-31 12:32:17 +02:00
Olivier Houchard	cf612a0457	MINOR: servers: Add a counter for the number of currently used connections. Add a counter to know the current number of used connections, as well as the max, this will be used later to refine the algorithm used to kill idle connections, based on current usage.	2020-03-30 00:30:01 +02:00
Jerome Magnin	824186bb08	MEDIUM: stream: support use-server rules with dynamic names With server-template was introduced the possibility to scale the number of servers in a backend without needing a configuration change and associated reload. On the other hand it became impractical to write use-server rules for these servers as they would only accept existing server labels as argument. This patch allows the use of log-format notation to describe targets of use-server rules, such as in the example below: listen test bind *:1234 use-server %[hdr(srv)] if { hdr(srv) -m found } use-server s1 if { path / } server s1 127.0.0.1:18080 server s2 127.0.0.1:18081 If a use-server rule is applied because it was conditioned by an ACL returning true, but the target of the use-server rule cannot be resolved, no other use-server rule is evaluated and we fall back to load balancing. This feature was requested on the ML, and bumped with issue #563.	2020-03-29 09:55:10 +02:00
Olivier Houchard	dbda31939d	BUG/MINOR: connections: Set idle_time before adding to idle list. In srv_add_to_idle_list(), make sure we set the idle_time before we add the connection to an idle list, not after, otherwise another thread may grab it, set the idle_time to 0, only to have the original thread set it back to now_ms. This may have an impact, as in conn_free() we check idle_time to decide if we should decrement the idle connection counters for the server.	2020-03-22 20:05:59 +01:00
Olivier Houchard	ad91124bcf	BUILD/MEDIUM: fd: Declare fd_mig_lock as extern. Declare fd_mig_lock as extern so that it isn't defined multiple times. This should fix build for architectures without double-width CAS.	2020-03-20 11:42:11 +01:00
Olivier Houchard	566df309c6	MEDIUM: connections: Attempt to get idle connections from other threads. In connect_server(), if we no longer have any idle connections for the current thread, attempt to use the new "takeover" mux method to steal a connection from another thread. This should have no impact right now, given no mux implements it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	d2489e00b0	MINOR: connections: Add a flag to know if we're in the safe or idle list. Add flags to connections, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST, to let one know we are in the safe list, or the idle list.	2020-03-19 22:07:33 +01:00
Olivier Houchard	f0d4dff25c	MINOR: connections: Make the "list" element a struct mt_list instead of list. Make the "list" element a struct mt_list, and explicitly use list_from_mt_list to get a struct list * where it is used as such, so that mt_list_for_each_entry will be usable with it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	00bdce24d5	MINOR: connections: Add a new mux method, "takeover". Add a new mux method, "takeover", that will attempt to make the current thread responsible for the connection. It should return 0 on success, and non-zero on failure.	2020-03-19 22:07:33 +01:00
Olivier Houchard	8851664293	MINOR: fd: Implement fd_takeover(). Implement a new function, fd_takeover(), that lets you become the thread responsible for the fd. On architectures that do not have a double-width CAS, use a global rwlock. fd_set_running() was also changed to be able to compete with fd_takeover(), either using a double-width CAS on both running_mask and thread_mask, or by claiming a reader on the global rwlock. This extra operation should not have any measurable impact on modern architectures where threading is relevant.	2020-03-19 22:07:33 +01:00
Olivier Houchard	dc2f2753e9	MEDIUM: servers: Split the connections into idle, safe, and available. Revamp the server connection lists. We now have 3 lists : - idle_conns, which contains idling connections - safe_conns, which contains idling connections that are safe to use even for the first request - available_conns, which contains connections that are not idling, but can still accept new streams (those are HTTP/2 or fastcgi, and are always considered safe).	2020-03-19 22:07:33 +01:00
Olivier Houchard	2444aa5b66	MEDIUM: sessions: Don't be responsible for connections anymore. Make it so sessions are not responsible for connection anymore, except for connections that are private, and thus can't be shared, otherwise, as soon as a request is done, the session will just add the connection to the orphan connections pool. This will break http-reuse safe, but it is expected to be fixed later.	2020-03-19 22:07:33 +01:00
Olivier Houchard	899fb8abdc	MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc. The flush_lock was introduced, mostly to be sure that pool_gc() will never dereference a pointer that has been free'd. __pool_get_first() was acquiring the lock too; the fear was that otherwise that pointer could get free'd later, and then pool_gc() would attempt to dereference it. However, that cannot happen, because the only functions that can free a pointer, when using lockless pools, are pool_gc() and pool_flush(), and as long as those two are mutually exclusive, nobody will be able to free the pointer while pool_gc() attempts to access it. So change the flush_lock to a spinlock, and don't bother to acquire/release it in __pool_get_first(), that way callers of __pool_get_first() won't have to wait while the pool is flushed. The worst that can happen is we call __pool_refill_alloc() while the pool is getting flushed, and memory can get allocated just to be free'd. This may help with github issue #552 This may be backported to 2.1, 2.0 and 1.9.	2020-03-18 15:55:35 +01:00
Olivier Houchard	de01ea9878	MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h. Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into types/signal.h, so that we can access them in another file. We need those definition to avoid blocking those signals when running __signal_process_queue(). This should be backported to 2.1, 2.0 and 1.9.	2020-03-18 13:07:19 +01:00
Olivier Houchard	a7bf573520	MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock. In the struct fdtab, introduce a new mask, running_mask. Each thread should add its bit before using the fd. Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just spin as long as the mask is non-zero, to be sure we access the data exclusively. fd_set_running_excl() spins until the mask is 0, fd_set_running() just adds the thread bit, and fd_clr_running() removes it.	2020-03-17 15:30:07 +01:00
William Lallemand	2954c478eb	MEDIUM: ssl: allow crt-list caching The crtlist structure defines a crt-list in the HAProxy configuration. It contains crtlist_entry structures which are the lines in a crt-list file. crt-lists are now loaded in memory using crtlist and crtlist_entry structures. The file is read only once. The generation algorithm changed a little bit, new ckch instances are generated from the crtlist structures, instead of being generated during the file loading. The loading function was split in two, one that loads and caches the crt-list and certificates, and one that looks for a crt-list and creates the ckch instances. Filters are also stored in crtlist_entry->filters as a char ** so we can generate the sni_ctx again if needed. It won't be needed anymore to parse the sni_ctx to do that. A crtlist_entry stores the list of all ckch_inst that were generated from this entry.	2020-03-16 16:18:49 +01:00
Willy Tarreau	e4d42551bd	BUILD: pools: silence build warnings with DEBUG_MEMORY_POOLS and DEBUG_UAF With these debug options we still get these warnings: include/common/memory.h:501:23: warning: null pointer dereference [-Wnull-dereference] *(volatile int *)0 = 0; ~~~~~~~~~~~~~~~~~~~^~~ include/common/memory.h:460:22: warning: null pointer dereference [-Wnull-dereference] *(volatile int *)0 = 0; ~~~~~~~~~~~~~~~~~~~^~~ These are purposely there to crash the process at specific locations. But the annoying warnings do not help with debugging and they are not even reliable as the compiler may decide to optimize them away. Let's pass the pointer through DISGUISE() to avoid this.	2020-03-14 11:10:21 +01:00
Willy Tarreau	2e8ab6b560	MINOR: use DISGUISE() everywhere we deliberately want to ignore a result It's more generic and versatile than the previous shut_your_big_mouth_gcc() that was used to silence annoying warnings as it's not limited to ignoring syscalls returns only. This allows us to get rid of the aforementioned function and the shut_your_big_mouth_gcc_int variable, that started to look ugly in multi-threaded environments.	2020-03-14 11:04:49 +01:00
Willy Tarreau	15ed69fd3f	MINOR: debug: consume the write() result in BUG_ON() to silence a warning Tim reported that BUG_ON() issues warnings on his distro, as the libc marks some syscalls with __attribute__((warn_unused_result)). Let's pass the write() result through DISGUISE() to hide it.	2020-03-14 10:58:35 +01:00
Willy Tarreau	f401668306	MINOR: debug: add a new DISGUISE() macro to pass a value as identity This does exactly the same as ALREADY_CHECKED() but does it inline, returning an identical copy of the scalar variable without letting the compiler know how it might have been transformed. This can forcefully disable certain null-pointer checks or result checks when known undesirable. Typically forcing a crash with *(DISGUISE(NULL))=0 will not cause a null-deref warning.	2020-03-14 10:52:46 +01:00
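The idea behind DISGUISE() described above can be sketched as follows. This is an illustration only, assuming GCC or Clang (statement expressions, `__typeof__` and inline asm); the macro name `DISGUISE_SKETCH` and the helper `identity_int_sketch` are hypothetical and the exact asm constraints used in HAProxy may differ. An empty asm statement that claims to read and write the value prevents the compiler from reasoning about its contents, while the value itself is returned unchanged.

```c
/* Assumption: GCC/Clang extensions are available. Not HAProxy's exact macro. */
#define DISGUISE_SKETCH(v) ({                                   \
        __typeof__(v) v_ = (v);                                 \
        /* empty asm: pretends to modify v_ in a register */    \
        __asm__ volatile("" : "+r"(v_));                        \
        v_;                                                     \
})

/* Hypothetical helper: returns its argument, but the compiler can no
 * longer assume anything about the returned value.
 */
static int identity_int_sketch(int x)
{
    return DISGUISE_SKETCH(x);
}
```

Because the macro is an expression, it can wrap a return value that is deliberately ignored, or a pointer that is deliberately dereferenced, without a separate statement.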
Ilya Shipitsin	77e3b4a2c4	CLEANUP: assorted typo fixes in the code and comments These are mostly comments in the code. A few error messages were fixed and are of low enough importance not to deserve a backport. Some regtests were also fixed.	2020-03-14 09:42:07 +01:00
Tim Duesterhus	cf6e0c8a83	MEDIUM: proxy_protocol: Support sending unique IDs using PPv2 This patch adds the `unique-id` option to `proxy-v2-options`. If this option is set a unique ID will be generated based on the `unique-id-format` while sending the proxy protocol v2 header and stored as the unique id for the first stream of the connection. This feature is meant to be used in `tcp` mode. It works on HTTP mode, but might result in inconsistent unique IDs for the first request on a keep-alive connection, because the unique ID for the first stream is generated earlier than the others. Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made available in TCP mode.	2020-03-13 17:26:43 +01:00
Tim Duesterhus	d1b15b6e9b	MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections This patch reads a proxy protocol v2 provided unique ID and makes it available using the `fc_pp_unique_id` fetch.	2020-03-13 17:25:23 +01:00
Tim Duesterhus	b435f77620	DOC: proxy_protocol: Reserve TLV type 0x05 as PP2_TYPE_UNIQUE_ID This reserves and defines TLV type 0x05.	2020-03-13 17:25:23 +01:00
Olivier Houchard	84fd8a77b7	MINOR: lists: fix indentation. Fix indentation in the recently added list_to_mt_list().	2020-03-11 21:41:13 +01:00
Olivier Houchard	8676514d4e	MINOR: servers: Kill priv_conns. Remove the list of private connections from server, it has been largely unused, we only inserted connections in it, but we would never actually use it.	2020-03-11 19:20:01 +01:00
Olivier Houchard	751e5e21a9	MINOR: lists: Implement function to convert list => mt_list and mt_list => list Implement mt_list_to_list() and list_to_mt_list(), to be able to convert from a struct list to a struct mt_list, and vice versa. This is normally of no use, except for struct connection's list field, that can go in either a struct list or a struct mt_list.	2020-03-11 17:10:40 +01:00
Olivier Houchard	49983a9fe1	MINOR: mt_lists: Appease gcc. gcc is confused, and thinks p may end up being NULL in _MT_LIST_RELINK_DELETED. It should never happen, so let gcc know that.	2020-03-11 17:10:08 +01:00
Willy Tarreau	638698da37	BUILD: stream-int: fix a few includes dependencies The stream-int code doesn't need to load server.h as it doesn't use servers at all. However removing this one reveals that proxy.h was lacking types/checks.h that used to be silently inherited from types/server.h loaded before in stream_interface.h.	2020-03-11 14:15:33 +01:00
Willy Tarreau	855796bdc8	BUG/MAJOR: list: fix invalid element address calculation Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10 (pre-release). It turns out that constructs such as: while (item != head) { item = LIST_ELEM(item.n); } loop forever, never matching <item> to <head> despite a printf there showing them equal. In practice the problem is that the LIST_ELEM() macro is wrong, it assigns the subtract of two pointers (an integer) to another pointer through a cast to its pointer type. And GCC 10 now considers that this cannot match a pointer and silently optimizes the comparison away. A tested workaround for this is to build with -fno-tree-pta. Note that older gcc versions even with -ftree-pta do not exhibit this rather surprising behavior. This patch changes the test to instead cast the null-based address to an int to get the offset and subtract it from the pointer, and this time it works. There were just a few places to adjust. Ideally offsetof() should be used but the LIST_ELEM() API doesn't make this trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*) thus it would require to completely change the whole API, which is not something workable in the short term, especially for a backport. With this change, the emitted code is subtly different even on older versions. A code size reduction of ~600 bytes and a total executable size reduction of ~1kB are expected to be observed and should not be taken as an anomaly. 
Typically this loop in dequeue_proxy_listeners() : while ((listener = MT_LIST_POP(...))) used to produce this code where the comparison is performed on RAX while the new offset is assigned to RDI even though both are always identical: 53ded8: 48 8d 78 c0 lea -0x40(%rax),%rdi 53dedc: 48 83 f8 40 cmp $0x40,%rax 53dee0: 74 39 je 53df1b <dequeue_proxy_listeners+0xab> and now produces this one which is slightly more efficient as the same register is used for both purposes: 53dd08: 48 83 ef 40 sub $0x40,%rdi 53dd0c: 74 2d je 53dd3b <dequeue_proxy_listeners+0x9b> Similarly, retrieving the channel from a stream_interface using si_ic() and si_oc() used to cause this (stream-int in rdi): 1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi) 1cbe: f6 47 04 10 testb $0x10,0x4(%rdi) 1cc2: 74 1c je 1ce0 <si_report_error+0x30> 1cc4: 48 81 ef 00 03 00 00 sub $0x300,%rdi 1ccb: 81 4f 10 00 08 00 00 orl $0x800,0x10(%rdi) and now causes this: 1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi) 1cbe: f6 47 04 10 testb $0x10,0x4(%rdi) 1cc2: 74 1c je 1ce0 <si_report_error+0x30> 1cc4: 81 8f 10 fd ff ff 00 orl $0x800,-0x2f0(%rdi) There is extremely little chance that this fix wakes up a dormant bug as the emitted code effectively does what the source code intends. This must be backported to all supported branches (dropping MT_LIST_ELEM and the spoa_example parts as needed), since the bug is subtle and may not always be visible even when compiling with gcc-10.	2020-03-11 14:12:51 +01:00
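The commit above notes that offsetof() would be the ideal way to recover an element from its list link, but that LIST_ELEM()'s calling convention does not allow it. A minimal sketch of the offsetof()-based form is shown here, under the assumption of a different macro signature that receives the struct type directly. All names (`list_sketch`, `item_sketch`, `ELEM_FROM_LINK`, `sum_items_sketch`) are hypothetical, and this is not the fix applied by the patch.

```c
#include <stddef.h>

/* Hypothetical doubly-linked list link, similar in spirit to struct list. */
struct list_sketch {
    struct list_sketch *n; /* next */
    struct list_sketch *p; /* previous */
};

/* Recover the enclosing element from a pointer to its link member by
 * subtracting the member's byte offset from the link address.
 */
#define ELEM_FROM_LINK(lp, type, member) \
        ((type *)((char *)(lp) - offsetof(type, member)))

struct item_sketch {
    int value;
    struct list_sketch link;
};

/* Walk a circular list and sum the values. The loop compares link
 * pointers with the head, so no element pointer is ever derived from
 * the head itself.
 */
static int sum_items_sketch(struct list_sketch *head)
{
    struct list_sketch *l;
    int total = 0;

    for (l = head->n; l != head; l = l->n)
        total += ELEM_FROM_LINK(l, struct item_sketch, link)->value;
    return total;
}
```

Comparing link pointers rather than element pointers keeps the termination test between two pointers of the same kind, which avoids the comparison that GCC 10 optimized away in the report above.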
Olivier Houchard	1d117e3dcd	BUG/MEDIUM: mt_lists: Make sure we set the deleted element to NULL; In MT_LIST_DEL_SAFE(), when the code was changed to use a temporary variable instead of using the provided pointer directly, we shouldn't have changed the code that set the pointer to NULL, as we really want the pointer provided to be nullified, otherwise other parts of the code won't know we just deleted an element, and bad things will happen. This should be backported to 2.1.	2020-03-10 17:45:05 +01:00
Willy Tarreau	9a0dfa5298	CLEANUP: remove the now unused common/syscall.h It was added 9 years ago to implement USE_MY_SPLICE on some libcs where syscall() was bogus. It's about time to get rid of this.	2020-03-10 07:28:46 +01:00
Willy Tarreau	06c63aec95	CLEANUP: remove support for USE_MY_SPLICE The splice() syscall has been supported in glibc since version 2.5 issued in 2006 and is present on supported systems so there's no need for having our own arch-specific syscall definitions anymore.	2020-03-10 07:23:41 +01:00
Willy Tarreau	3858b122a6	CLEANUP: remove support for USE_MY_EPOLL This was made to support epoll on patched 2.4 kernels, and on early 2.6 using alternative libcs thanks to the arch-specific syscall definitions. All the features we support have been around since 2.6.2 and present in glibc since 2.3.2, neither of which are found in field anymore. Let's simply drop this and use epoll normally.	2020-03-10 07:08:10 +01:00
Willy Tarreau	618ac6ea52	CLEANUP: drop support for USE_MY_ACCEPT4 The accept4() syscall has been present for a while now, there is no more reason for maintaining our own arch-specific syscall implementation for systems lacking it in libc but having it in the kernel.	2020-03-10 07:02:46 +01:00
Willy Tarreau	c3e926bf3b	CLEANUP: remove support for Linux i686 vsyscalls This was introduced 10 years ago to squeeze a few CPU cycles per syscall on 32-bit x86 machines and was already quite old by then, requiring to explicitly enable support for this in the kernel. We don't even know if it still builds, let alone if it works at all on recent kernels! Let's completely drop this now.	2020-03-10 06:55:52 +01:00
William Lallemand	6763016866	BUG/MINOR: ssl/cli: sni_ctx' mustn't always be used as filters Since commit 244b070 ("MINOR: ssl/cli: support crt-list filters"), HAProxy generates a list of filters based on the sni_ctx in memory. However it's not always relevant, sometimes no filters were configured and the CN/SAN in the new certificate are not the same. This patch fixes the issue by using a flag filters in the ckch_inst, so we are able to know if there were filters or not. In the latter case it uses the CN/SAN of the new certificate to generate the sni_ctx. note: filters are still only used in the crt-list atm.	2020-03-09 17:32:04 +01:00
William Lallemand	0a52846603	CLEANUP: ssl: is_default is a bit in ckch_inst The field is_default becomes a bit in the ckch_inst structure.	2020-03-09 17:32:04 +01:00
Miroslav Zagorac	d7dc67ba1d	CLEANUP: remove unused code in 'my_ffsl/my_flsl' functions Shifting the variable 'a' one bit to the right has no effect on the result of the functions.	2020-03-09 14:47:27 +01:00
Willy Tarreau	ee3bcddef7	MINOR: tools: add a generic function to generate UUIDs We currently have two UUID generation functions, one for the sample fetch and the other one in the SPOE filter. Both were a bit complicated since they were made to support random() implementations returning an arbitrary number of bits, and were throwing away 33 bits every 64. Now we don't need this anymore, so let's have a generic function consuming 64 bits at once and use it as appropriate.	2020-03-08 18:04:16 +01:00
Willy Tarreau	52bf839394	BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG This is the replacement of the failed attempt to add thread safety and per-process sequences of random numbers initially tried with commit `1c306aa84d` ("BUG/MEDIUM: random: implement per-thread and per-process random sequences"). This new version takes a completely different approach and doesn't try to work around the horrible OS-specific and non-portable random API anymore. Instead it implements "xoroshiro128**", a reputedly high quality random number generator, which is one of the many variants of xorshift, which passes all quality tests and which is described here: http://prng.di.unimi.it/ While not cryptographically secure, it is fast and features a 2^128-1 period. It supports fast jumps allowing to cut the period into smaller non-overlapping sequences, which we use here to support up to 2^32 processes each having their own, non-overlapping sequence of 2^96 numbers (~7*10^28). This is enough to provide 1 billion randoms per second and per process for 2200 billion years. The implementation was made thread-safe either by using a double 64-bit CAS on platforms supporting it (x86_64, aarch64) or by using a local lock for the time needed to perform the shift operations. This ensures that all threads pick numbers from the same pool so that it is not needed to assign per-thread ranges. For processes we use the fast jump method to advance the sequence by 2^96 for each process. 
Before this patch, the following config: global nbproc 8 frontend f bind :4445 mode http log stdout format raw daemon log-format "%[uuid] %pid" redirect location / Would produce this output: a4d0ad64-2645-4b74-b894-48acce0669af 12987 a4d0ad64-2645-4b74-b894-48acce0669af 12992 a4d0ad64-2645-4b74-b894-48acce0669af 12986 a4d0ad64-2645-4b74-b894-48acce0669af 12988 a4d0ad64-2645-4b74-b894-48acce0669af 12991 a4d0ad64-2645-4b74-b894-48acce0669af 12989 a4d0ad64-2645-4b74-b894-48acce0669af 12990 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986 (...) And now produces: f94b29b3-da74-4e03-a0c5-a532c635bad9 13011 47470c02-4862-4c33-80e7-a952899570e5 13014 86332123-539a-47bf-853f-8c8ea8b2a2b5 13013 8f9efa99-3143-47b2-83cf-d618c8dea711 13012 3cc0f5c7-d790-496b-8d39-bec77647af5b 13015 3ec64915-8f95-4374-9e66-e777dc8791e0 13009 0f9bf894-dcde-408c-b094-6e0bb3255452 13011 49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014 e23f6f2e-35c5-4433-a294-b790ab902653 13012 There are multiple benefits to using this method. First, it doesn't depend anymore on a non-portable API. Second it's thread safe. Third it is fast and more proven than any hack we could attempt to try to work around the deficiencies of the various implementations around. This commit depends on previous patches "MINOR: tools: add 64-bit rotate operators" and "BUG/MEDIUM: random: initialize the random pool a bit better", all of which will need to be backported at least as far as version 2.0. It doesn't require to backport the build fixes for circular include files dependency anymore.	2020-03-08 10:09:02 +01:00
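The core step of the generator named in the commit above can be sketched as follows, using the constants of the publicly documented xoroshiro128** 1.0 algorithm. This is a single-threaded illustration only: it omits the double-width CAS or lock that the commit describes for thread safety, and it omits the jump function used to give each process its own sequence. The function names `rotl64_sketch` and `xoroshiro128ss_next_sketch` are hypothetical.

```c
#include <stdint.h>

/* Rotate a 64-bit word left by <bits>; the masks keep both shift
 * counts in the 0..63 range.
 */
static inline uint64_t rotl64_sketch(uint64_t v, unsigned int bits)
{
    return (v << (bits & 63)) | (v >> ((64 - bits) & 63));
}

/* One xoroshiro128** step on a 128-bit state held in s[0] and s[1].
 * The state must not be all zeroes. Not thread-safe as written.
 */
static uint64_t xoroshiro128ss_next_sketch(uint64_t s[2])
{
    const uint64_t s0 = s[0];
    uint64_t s1 = s[1];
    /* output scrambler: multiply, rotate, multiply */
    const uint64_t result = rotl64_sketch(s0 * 5, 7) * 9;

    /* state transition: xor, rotate, shift */
    s1 ^= s0;
    s[0] = rotl64_sketch(s0, 24) ^ s1 ^ (s1 << 16);
    s[1] = rotl64_sketch(s1, 37);
    return result;
}
```

Since each call consumes and updates the whole 128-bit state, a shared generator needs the state update to be atomic, which is why the commit relies on a double 64-bit CAS or a lock around these few operations.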
Willy Tarreau	7a40909c00	MINOR: tools: add 64-bit rotate operators This adds rotl64/rotr64 to rotate a 64-bit word by an arbitrary number of bits. It's mainly aimed at being used with constants.	2020-03-08 00:42:18 +01:00
Willy Tarreau	0fbf28a05b	Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences" This reverts commit `1c306aa84d`. It breaks the build on all non-glibc platforms. I got confused by the man page (which possibly is the most confusing man page I've ever read about a standard libc function) and mistakenly understood that random_r was portable, especially since it appears in latest freebsd source as well but not in released versions, and with a slightly different API :-/ We need to find a different solution with a fallback. Among the possibilities, we may reintroduce this one with a fallback relying on locking around the standard functions, keeping fingers crossed for no other library function to call them in parallel, or we may also provide our own PRNG, which is not necessarily more difficult than working around the totally broken up design of the portable API.	2020-03-07 11:24:39 +01:00
Willy Tarreau	1c306aa84d	BUG/MEDIUM: random: implement per-thread and per-process random sequences As mentioned in previous patch, the random number generator was never made thread-safe, which used not to be a problem for health checks spreading, until the uuid sample fetch function appeared. Currently it is possible for two threads or processes to produce exactly the same UUID. In fact it's extremely likely that this will happen for processes, as can be seen with this config: global nbproc 8 frontend f bind :4445 mode http log stdout daemon format raw log-format "%[uuid] %pid" redirect location / It typically produces this log: 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646 07764439-c24d-4e6f-a5a6-0138be59e7a8 30645 07764439-c24d-4e6f-a5a6-0138be59e7a8 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643 07764439-c24d-4e6f-a5a6-0138be59e7a8 30646 b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642 07764439-c24d-4e6f-a5a6-0138be59e7a8 30642 What this patch does is to use a distinct per-thread and per-process seed to make sure the same sequences will not appear, and will then extend these seeds by "burning" a number of randoms that depends on the global random seed, the thread ID and the process ID. This adds roughly 20 extra bits of randomness, resulting in 52 bits total per thread and per process. It only takes a few milliseconds to burn these randoms and given that threads start with a different seed, we know they will not catch each other. So these random extra bits are essentially added to ensure randomness between boots and cluster instances. This replaces all uses of random() with ha_random() which uses the thread-local state. This must be backported as far as 2.0 or any version having the UUID sample-fetch function since it's the main victim here. 
It's important to note that this patch, in addition to depending on the previous one "BUG/MEDIUM: init: initialize the random pool a bit better", also depends on the preceding build fixes to address a circular dependency issue in the include files that prevented it from building. Part or all of these patches may need to be backported or adapted as well.	2020-03-07 06:11:15 +01:00
Willy Tarreau	6c3a681bd6	BUG/MEDIUM: random: initialize the random pool a bit better Since the UUID sample fetch was created, some people noticed that in certain virtualized environments they manage to get exact same UUIDs on different instances started exactly at the same moment. It turns out that the randoms were only initialized to spread the health checks originally, not to provide "clean" randoms. This patch changes this and collects more randomness from various sources, including existing randoms, /dev/urandom when available, RAND_bytes() when OpenSSL is available, as well as the timing for such operations, then applies a SHA1 on all this to keep a 160 bits random seed available, 32 of which are passed to srandom(). It's worth mentioning that there's no clean way to pass more than 32 bits to srandom() as even initstate() provides an opaque state that must absolutely not be tampered with since known implementations contain state information. At least this allows to have up to 4 billion different sequences from the boot, which is not that bad. Note that the thread safety was still not addressed, which is another issue for another patch. This must be backported to all versions containing the UUID sample fetch function, i.e. as far as 2.0.	2020-03-07 06:11:11 +01:00
Willy Tarreau	5a421a8f49	BUILD: listener: types/listener.h must not include standard.h It's only a type definition, this header is not needed and causes some circular dependency issues.	2020-03-07 06:07:18 +01:00
Willy Tarreau	c7f64e7a58	BUILD: freq_ctr: proto/freq_ctr needs to include common/standard.h This is needed for div_64_32() which is there and currently accidentally inherited via global.h!	2020-03-07 06:07:18 +01:00
Willy Tarreau	f23e029409	BUILD: global: must not include common/standard.h but only types/freq_ctr.h This one was accidentally inherited and used to work but causes a circular dependency.	2020-03-07 06:07:18 +01:00
Willy Tarreau	8dd0d55efe	BUILD: ssl: include mini-clist.h We use some list definitions and we don't include this header which is in fact accidentally inherited from others, causing a circular dependency issue.	2020-03-07 06:07:18 +01:00
Willy Tarreau	a8561db936	BUILD: buffer: types/{ring.h,checks.h} should include buf.h, not buffer.h buffer.h relies on proto/activity because it contains some code and not just type definitions. It must not be included from types files. It should probably also be split in two if it starts to include a proto. This causes some circular dependencies at other places.	2020-03-07 06:07:18 +01:00
Christopher Faulet	d8f0e073dd	MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY This flag was used in some internal functions to be sure the current stream is able to handle HTTP content. It was introduced when the legacy HTTP code was still there. Now, It is possible to rely on stream's flags to be sure we have an HTX stream. So the flag HLUA_TXN_HTTP_RDY can be removed. Everywhere it was tested, it is replaced by a call to the IS_HTX_STRM() macro. This patch is mandatory to allow the support of the filters written in lua.	2020-03-06 14:13:00 +01:00
Christopher Faulet	1cdceb9365	MINOR: htx: Add a function to return a block at a specific offset The htx_find_offset() function may be used to look for a block at a specific offset in an HTX message, starting from the message head. A compound result is returned, an htx_ret structure, with the found block and the position of the offset in the block. If the offset is ouside of the HTX message, the returned block is NULL.	2020-03-06 14:12:59 +01:00
Christopher Faulet	251f4917c3	MINOR: buf: Add function to insert a string at an absolute offset in a buffer The b_insert_blk() function may now be used to insert a string, given a pointer and the string length, at an absolute offset in a buffer, moving data between this offset and the buffer's tail just after the end of the inserted string. The buffer's length is automatically updated. This function supports wrapping. All the string is copied or nothing. So it returns 0 if there is not enough space to perform the copy. Otherwise, the number of bytes copied is returned.	2020-03-06 14:12:59 +01:00
Carl Henrik Lunde	f91ac19299	OPTIM: startup: fast unique_id allocation for acl. pattern_finalize_config() uses an inefficient algorithm which is a problem with very large configuration files. This affects startup, and therefore reload time. When haproxy is deployed as a router in a Kubernetes cluster the generated configuration file may be large and reloads are frequently occurring, which makes this a significant issue. The old algorithm is O(n^2): * allocate missing uids - O(n^2) * sort linked list - O(n^2). The new algorithm is O(n log n): * find the user allocated uids - O(n) * store them for efficient lookup - O(n log n) * allocate missing uids - n times O(log n) * sort all uids - O(n log n) * convert back to linked list - O(n). Performance examples, startup time in seconds (pat_refs: old / new): 1000: 0.02 / 0.01; 10000: 2.1 / 0.04; 20000: 12.3 / 0.07; 30000: 27.9 / 0.10; 40000: 52.5 / 0.14; 50000: 77.5 / 0.17. Please backport to 1.8, 2.0 and 2.1.	2020-03-06 08:11:58 +01:00
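The O(n log n) idea in the entry above can be sketched on a plain array. This is not the code from pattern_finalize_config(): `assign_missing_ids()` is a hypothetical helper, a negative value is assumed to mean "no user-supplied id", and the per-id O(log n) lookup is replaced by a single merge-style walk over the sorted list of used ids. The final sort and the conversion back to a linked list are left out.

```c
#include <stdlib.h>

/* qsort() comparator for ints that avoids overflow from subtraction. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Give every negative entry of ids[] the smallest non-negative id that
 * is not already present. Returns 0 on success, -1 if memory runs out.
 */
int assign_missing_ids(int *ids, size_t n)
{
    int *used;
    size_t nused = 0, u = 0, i;
    int next = 0;

    if (n == 0)
        return 0;

    used = malloc(n * sizeof(*used));
    if (!used)
        return -1;

    /* 1. collect the user-allocated ids: O(n) */
    for (i = 0; i < n; i++)
        if (ids[i] >= 0)
            used[nused++] = ids[i];

    /* 2. sort them for efficient lookup: O(n log n) */
    qsort(used, nused, sizeof(*used), cmp_int);

    /* 3. allocate missing ids; both cursors only move forward: O(n) */
    for (i = 0; i < n; i++) {
        if (ids[i] >= 0)
            continue;
        while (u < nused && used[u] <= next) {
            if (used[u] == next)
                next++;   /* candidate already taken, try the next one */
            u++;
        }
        ids[i] = next++;
    }

    free(used);
    return 0;
}
```

Because `next` and `u` never move backwards, the allocation pass stays linear, so the sort dominates the total cost.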
Tim Duesterhus	a17e66289c	MEDIUM: stream: Make the `unique_id` member of `struct stream` a `struct ist` The `unique_id` member of `struct stream` now is a `struct ist`.	2020-03-05 20:21:58 +01:00
Tim Duesterhus	0643b0e7e6	MINOR: proxy: Make `header_unique_id` a `struct ist` The `header_unique_id` member of `struct proxy` now is a `struct ist`.	2020-03-05 19:58:22 +01:00
Tim Duesterhus	9576ab7640	MINOR: ist: Add `struct ist istdup(const struct ist)` istdup() performs the equivalent of strdup() on a `struct ist`.	2020-03-05 19:53:12 +01:00
Tim Duesterhus	35005d01d2	MINOR: ist: Add `struct ist istalloc(size_t)` and `void istfree(struct ist*)` `istalloc` allocates memory and returns an `ist` with the size `0` that points to this allocation. `istfree` frees the pointed memory and clears the pointer.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	e296d3e5f0	MINOR: ist: Add `int isttest(const struct ist)` `isttest` returns whether the `.ptr` is non-null.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	241e29ef9c	MINOR: ist: Add `IST_NULL` macro `IST_NULL` is equivalent to a `struct ist` with `.ptr = NULL` and `.len = 0`.	2020-03-05 19:52:07 +01:00
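The four `ist` entries above fit together as shown in the following sketch. The names and signatures come from the commit subjects, but the bodies are assumptions written for illustration, and the `struct ist` here is a simplified local stand-in rather than the project's header.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in: a pointer plus a length, no trailing NUL needed. */
struct ist {
    char *ptr;
    size_t len;
};

/* An ist with .ptr = NULL and .len = 0. */
#define IST_NULL ((struct ist){ .ptr = NULL, .len = 0 })

/* Returns non-zero when .ptr is non-null. */
static inline int isttest(const struct ist ist)
{
    return ist.ptr != NULL;
}

/* Allocates <size> bytes and returns an ist of length 0 pointing to them.
 * On allocation failure the returned ist fails isttest().
 */
static inline struct ist istalloc(size_t size)
{
    return (struct ist){ .ptr = malloc(size), .len = 0 };
}

/* Frees the pointed memory and clears the ist (assumed to reset .len too). */
static inline void istfree(struct ist *ist)
{
    free(ist->ptr);
    *ist = IST_NULL;
}

/* strdup()-like copy: the result owns its own storage. */
static inline struct ist istdup(const struct ist src)
{
    struct ist dst = istalloc(src.len);

    if (isttest(dst) && src.len) {
        memcpy(dst.ptr, src.ptr, src.len);
        dst.len = src.len;
    }
    return dst;
}
```

A caller would typically check `isttest()` on the result of `istdup()` before use and release it with `istfree()`, which leaves the variable in a state that is safe to test again.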
William Lallemand	cfca1422c7	MINOR: ssl: reach a ckch_store from a sni_ctx It was only possible to go down from the ckch_store to the sni_ctx but not to go up from the sni_ctx to the ckch_store. To allow that, 2 pointers were added: - a ckch_inst pointer in the struct sni_ctx - a ckch_store pointer in the struct ckch_inst	2020-03-05 11:28:42 +01:00
William Lallemand	38df1c8006	MINOR: ssl/cli: support crt-list filters Generate a list of the previous filters when updating a certificate which uses filters in crt-list. Then pass this list to the function generating the sni_ctx during the commit. This feature allows the update of the crt-list certificates which use the filters with "set ssl cert". This function could probably be replaced by creating a new ckch_inst_new_load_store() function which takes the previous sni_ctx list as an argument instead of the char **sni_filter, avoiding the allocation/copy during runtime for each filter. But since we are still handling the multi-cert bundles, it's better this way to avoid code duplication.	2020-03-05 11:27:53 +01:00
Tim Duesterhus	127a74dd48	MINOR: stream: Add stream_generate_unique_id function Currently unique IDs for a stream are generated using repetitive code in multiple locations, possibly allowing for inconsistent behavior.	2020-03-05 07:23:00 +01:00
Willy Tarreau	899e5f69a1	MINOR: debug: use our own backtrace function on clang+x86_64 A test on FreeBSD with clang 4 to 8 produces this on a call to a spinning loop on the CLI: call trace(5): \| 0x53e2bc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c \| 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e with our own function it correctly produces this: call trace(20): \| 0x53e2dc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c \| 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e \| 0x800e022bf [48 83 c4 38 5b 41 5c 41]: libthr:pthread_getspecific+0xdef \| 0x7ffffffff003 [48 8d 7c 24 10 6a 00 48]: main+0x7fffffb416f3 \| 0x801373809 [85 c0 0f 84 6f ff ff ff]: libc:__sys_gettimeofday+0x199 \| 0x801373709 [89 c3 85 c0 75 a6 48 8b]: libc:__sys_gettimeofday+0x99 \| 0x801371c62 [83 f8 4e 75 0f 48 89 df]: libc:gettimeofday+0x12 \| 0x51fa0a [48 89 df 4c 89 f6 e8 6b]: ha_thread_dump_all_to_trash+0x49a \| 0x4b723b [85 c0 75 09 49 8b 04 24]: mworker_cli_sockpair_new+0xd9b \| 0x4b6c68 [85 c0 75 08 4c 89 ef e8]: mworker_cli_sockpair_new+0x7c8 \| 0x532f81 [4c 89 e7 48 83 ef 80 41]: task_run_applet+0xe1 So let's add clang+x86_64 to the list of platforms that will use our simplified version. As a bonus it will not require to link with -lexecinfo on FreeBSD and will work out of the box when passing USE_BACKTRACE=1.	2020-03-04 12:04:07 +01:00
Willy Tarreau	13faf16e1e	MINOR: debug: improve backtrace() on aarch64 and possibly other systems It happens that on aarch64 backtrace() only returns one entry (tested with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding the stack due to the risk of hitting a bad pointer. Here we can use may_access() to know when it's safe, so we can actually unwind the stack without taking risks. It happens that the faulting function (the one just after the signal handler) is not listed here, very likely because the signal handler uses a special stack and did not create a new frame. So this patch creates a new my_backtrace() function in standard.h that either calls backtrace() or does its own unrolling. The choice depends on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build target.	2020-03-04 12:04:07 +01:00
Emmanuel Hocdet	842e94ee06	MINOR: ssl: add "ca-verify-file" directive It's only available for bind line. "ca-verify-file" allows to separate CA certificates from "ca-file". CA names sent in server hello message are only computed from "ca-file". Typically, "ca-file" must be defined with intermediate certificates and "ca-verify-file" with certificates ending the chain, like root CA. Fix issue #404.	2020-03-04 11:53:11 +01:00
Willy Tarreau	eb8b1ca3eb	MINOR: tools: add resolve_sym_name() to resolve function pointers We use various hacks at a few places to try to identify known function pointers in debugging outputs (show threads & show fd). Let's centralize this into a new function dedicated to this. It already knows about the functions matched by "show threads" and "show fd", and when built with USE_DL, it can rely on dladdr1() to resolve other functions. There are some limitations, as static functions are not resolved, linking with -rdynamic is mandatory, and even then some functions will not necessarily appear. It's possible to do a better job by rebuilding the whole symbol table from the ELF headers in memory but it's less portable and the gains are still limited, so this solution remains a reasonable tradeoff.	2020-03-03 18:18:40 +01:00
Willy Tarreau	762fb3ec8e	MINOR: tools: add new function dump_addr_and_bytes() This function dumps <n> bytes from <addr> in hex form into buffer <buf> enclosed in brackets after the address itself, formatted on 14 chars including the "0x" prefix. This is meant to be used as a prefix for code areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: " It relies on may_access() to know if the bytes are dumpable, otherwise "--" is emitted. An optional prefix is supported.	2020-03-03 17:46:37 +01:00
Willy Tarreau	27d00c0167	MINOR: task: export run_tasks_from_list This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	3ebd55ee51	MINOR: haproxy: export run_poll_loop This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	1827845a3d	MINOR: haproxy: export main to ease access from debugger Better just export main instead of declaring it as extern, it's cleaner and may be usable elsewhere.	2020-03-03 15:26:10 +01:00
Willy Tarreau	1ed3781e21	MINOR: fd: merge the read and write error bits into RW error We always set them both, which makes sense since errors at the FD level indicate a terminal condition for the socket that cannot be recovered. Usually this is detected via a write error, but sometimes such an error may asynchronously be reported on the read side. Let's simplify this using only the write bit and calling it RW since it's used like this everywhere, and leave the R bit spare for future use.	2020-02-28 07:42:29 +01:00
Willy Tarreau	a135ea63a6	CLEANUP: fd: remove some unneeded definitions of FD_EV_* flags There's no point in trying to be too generic for these flags as the read and write sides will soon differ a bit. Better explicitly define the flags for each direction without trying to be direction-agnostic. This clarifies the code and removes some defines.	2020-02-28 07:42:29 +01:00
Willy Tarreau	f80fe832b1	CLEANUP: fd: remove the FD_EV_STATUS aggregate This was used only by fd_recv_state() and fd_send_state(), both of which are unused. This will not work anymore once recv and send flags start to differ, so let's remove this.	2020-02-28 07:42:29 +01:00
Jerome Magnin	967d3cc105	BUG/MINOR: http_ana: make sure redirect flags don't have overlapping bits commit `c87e46881` ("MINOR: http-rules: Add a flag on redirect rules to know the rule direction") introduced a new flag for redirect rules, but its value has bits in common with REDIRECT_FLAG_DROP_QS, which makes us enter this code path in http_apply_redirect_rule(), which will then drop the query string. To fix this, just give REDIRECT_FLAG_FROM_REQ its own unique value. This must be backported where `c87e468816` is backported. This should fix issue 521.	2020-02-27 23:44:41 +01:00
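The class of bug described in the entry above is easy to reproduce in isolation. The sketch below uses hypothetical `DEMO_FLAG_*` names and values chosen for illustration (they are not the real REDIRECT_FLAG_* definitions): a flag whose value shares a bit with another one makes an unrelated test succeed, and a compile-time check is one way to catch that.

```c
/* Hypothetical flag set; values are for illustration only. */
enum {
    DEMO_FLAG_DROP_QS      = 0x01,
    DEMO_FLAG_APPEND_SLASH = 0x02,
    DEMO_FLAG_FROM_REQ_BAD = 0x03, /* wrong: shares bits with both above */
    DEMO_FLAG_FROM_REQ     = 0x04, /* right: owns a bit of its own */
};

/* Refuse to build if the direction flag overlaps the other flags (C11). */
_Static_assert((DEMO_FLAG_FROM_REQ &
                (DEMO_FLAG_DROP_QS | DEMO_FLAG_APPEND_SLASH)) == 0,
               "flag values must not share bits");

/* Returns 1 when the "drop query string" bit is present in <flags>. */
int demo_drops_query_string(unsigned int flags)
{
    return (flags & DEMO_FLAG_DROP_QS) != 0;
}
```

With the overlapping value, `demo_drops_query_string(DEMO_FLAG_FROM_REQ_BAD)` returns 1 even though nobody asked to drop the query string, which mirrors the symptom the commit describes.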
Willy Tarreau	2104659cd5	MEDIUM: buffer: remove the buffer_wq lock This lock was only needed to protect the buffer_wq list, but now we have the mt_list for this. This patch simply turns the buffer_wq list to an mt_list and gets rid of the lock. It's worth noting that the whole buffer_wait thing still looks totally wrong especially in a threaded context: the wakeup_cb() callback is called synchronously from any thread and may end up calling some connection code that was not expected to run on a given thread. The whole thing should probably be reworked to use tasklets instead and be a bit more centralized.	2020-02-26 10:39:36 +01:00
William Lallemand	e0f3fd5b4c	CLEANUP: ssl: move issuer_chain tree and definition Move the cert_issuer_tree outside the global_ssl structure since it's not a configuration variable. And move the declaration of the issuer_chain structure in types/ssl_sock.h	2020-02-25 15:06:40 +01:00
Willy Tarreau	226ef26056	MINOR: compiler: add new alignment macros This commit adds ALWAYS_ALIGN(), MAYBE_ALIGN() and ATOMIC_ALIGN() to be placed as delimiters inside structures to force alignment to a given size. These depend on the architecture's capabilities so that it is possible to always align, align only on archs not supporting unaligned accesses at all, or only on those not supporting them for atomic accesses (e.g. before a lock).	2020-02-25 10:34:43 +01:00
Willy Tarreau	908071171b	BUILD: general: always pass unsigned chars to is* functions The isalnum(), isalpha(), isdigit() etc functions from ctype.h are supposed to take an int in argument which must either reflect an unsigned char or EOF. In practice on some platforms they're implemented as macros referencing an array, and when passed a char, they either cause a warning "array subscript has type 'char'" when lucky, or cause random segfaults when unlucky. It's quite inconvenient by the way since none of them may return true for negative values. The recent introduction of cygwin to the list of regularly tested build platforms revealed a lot of breakage there due to the same issues again. So this patch addresses the problem all over the code at once. It adds unsigned char casts to every valid use case, and also drops the unneeded double cast to int that was sometimes added on top of it. It may be backported by dropping irrelevant changes if that helps better support uncommon platforms. It's unlikely to fix bugs on platforms which would already not emit any warning though.	2020-02-25 08:16:33 +01:00
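The pattern applied by the entry above looks like this in a small standalone function. `count_alnum()` is a hypothetical helper written for illustration; the only point it makes is the `(unsigned char)` cast on the argument of `isalnum()`.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the alphanumeric bytes of a NUL-terminated string. On platforms
 * where plain char is signed, bytes above 0x7f are negative, so they are
 * cast to unsigned char before being handed to the ctype function.
 */
size_t count_alnum(const char *s)
{
    size_t n = 0;

    for (; *s; s++)
        if (isalnum((unsigned char)*s))
            n++;
    return n;
}
```

Without the cast, a byte such as 0xe9 would reach `isalnum()` as a negative int on such platforms, which is outside the range the function is specified for.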
Willy Tarreau	03e7853581	BUILD: remove obsolete support for -mregparm / USE_REGPARM This used to be a minor optimization on ix86 where registers are scarce and the calling convention not very efficient, but this platform is not relevant enough anymore to warrant all this dirt in the code for the sake of saving 1 or 2% of performance. Modern platforms don't use this at all since their calling convention already defaults to using several registers so better get rid of this once for all.	2020-02-25 07:41:47 +01:00
Tim Duesterhus	1d48ba91d7	CLEANUP: net_helper: Do not negate the result of unlikely This patch turns the double negation of 'not unlikely' into 'likely' and then turns the negation of 'not smaller' into 'greater or equal' in an attempt to improve readability of the condition. [wt: this was not a bug but purposely written like this to improve code generation on older compilers but not needed anymore as described here: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]	2020-02-25 07:30:49 +01:00
Tim Duesterhus	927063b892	CLEANUP: conn: Do not pass a pointer to likely Move the `!` inside the likely and negate it to unlikely. The previous version should not have caused issues, because it is converted to a boolean / integral value before being passed to __builtin_expect(), but it's certainly unusual. [wt: this was not a bug but purposely written like this to improve code generation on older compilers but not needed anymore as described here: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]	2020-02-25 07:30:49 +01:00
Willy Tarreau	89ee79845c	MINOR: compiler: drop special cases of likely/unlikely for older compilers We used to special-case the likely()/unlikely() macros for a series of early gcc 4.x compilers which used to produce very bad code when using __builtin_expect(x,1), which basically used to build an integer (0 or 1) from a condition then compare it to integer 1. This was already fixed in 5.x, but even now, looking at the code produced by various flavors of 4.x this bad behavior couldn't be witnessed anymore. So let's consider it as fixed by now, which will allow to get rid of some ugly tricks at some specific places. A test on 4.7.4 shows that the code shrinks by about 3kB now, thanks to some tests being inlined closer to the call place and the unlikely case being moved to real functions. See the link below for more background on this. Link: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html	2020-02-25 07:29:55 +01:00
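For readers unfamiliar with the macros discussed in the three entries above, here is a minimal sketch of the plain form that remains once the compiler-specific workarounds are gone. The macro bodies are the commonly used definitions and are an assumption here, not a copy of the project's compiler.h; `demo_read_u16()` is a hypothetical helper, and `__builtin_expect()` requires GCC or clang.

```c
#include <stddef.h>

/* Branch prediction hints: normalize the condition to 0 or 1, then tell
 * the compiler which value is expected.
 */
#define likely(x)   (__builtin_expect((x) != 0, 1))
#define unlikely(x) (__builtin_expect((x) != 0, 0))

/* Hypothetical helper: read a big-endian 16-bit value from <buf>.
 * Returns 0 and fills <out> on success, -1 if fewer than 2 bytes remain.
 */
int demo_read_u16(const unsigned char *buf, size_t len, unsigned int *out)
{
    if (unlikely(len < 2))
        return -1;   /* rare error path, kept out of the hot path */

    *out = ((unsigned int)buf[0] << 8) | buf[1];
    return 0;
}
```

Writing the test as `unlikely(len < 2)` rather than negating a `likely()` keeps the condition readable, which is the style the two cleanup commits move towards.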
Willy Tarreau	0e2686762f	MINOR: compiler: move CPU capabilities definition from config.h and complete them These ones are irrelevant to the config but rather to the platform, and as such are better placed in compiler.h. Here we take the opportunity for declaring a few extra capabilities: - HA_UNALIGNED : CPU supports unaligned accesses - HA_UNALIGNED_LE : CPU supports unaligned accesses in little endian - HA_UNALIGNED_FAST : CPU supports fast unaligned accesses - HA_UNALIGNED_ATOMIC : CPU supports unaligned accesses in atomics This will help remove a number of #ifdefs with arch-specific statements.	2020-02-21 16:32:57 +01:00
Jerome Magnin	9dde0b2d31	MINOR: ist: add an iststop() function Add a function that finds a character in an ist and returns an updated ist with the length of the portion of the original string that doesn't contain the char. Might be backported to 2.1	2020-02-21 11:47:25 +01:00
Willy Tarreau	716bec2dc6	MINOR: connection: introduce a new receive flag: CO_RFL_READ_ONCE This flag is currently supported by raw_sock to perform a single recv() attempt and avoid subscribing. Typically on the request and response paths with keep-alive, with short messages we know that it's very likely that the first message is enough.	2020-02-21 11:22:45 +01:00
Willy Tarreau	5d4d1806db	CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv} This marks the end of the transition from the connection polling states introduced in 1.5-dev12 and the subscriptions that arrived in 1.9. The socket layer can now safely use its FD while all upper layers rely exclusively on subscriptions. These old functions were removed. Some may deserve some renaming to improve clarity though. The single call to conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling() which already does the same.	2020-02-21 11:21:12 +01:00
Willy Tarreau	d1d14c3157	MINOR: connection: remove the last calls to conn_xprt_{want,stop}_* The last few calls to conn_xprt_{want,stop}_{recv,send} in the central connection code were replaced with their strictly exact equivalent fd_*, adding the call to conn_ctrl_ready() when it was missing.	2020-02-21 11:21:12 +01:00
Willy Tarreau	19bc201c9f	MEDIUM: connection: remove the intermediary polling state from the connection Historically we used to require that the connections held the desired polling states for the data layer and the socket layer. Then with muxes these were more or less merged into the transport layer, and now it happens that with all transport layers having their own state, the "transport layer state" as we have it in the connection (XPRT_RD_ENA, XPRT_WR_ENA) is only an exact copy of the underlying file descriptor state, but with a delay. All of this is causing some difficulties at many places in the code because there are still some locations which use the conn_want_* API to remain clean and only rely on connection, and count on a later collection call to conn_cond_update_polling(), while others need an immediate action and directly use the FD updates. Since our updates are now much cheaper, most of them being only an atomic test-and-set operation, and since our I/O callbacks are deferred, there's no benefit anymore in trying to "cache" the transient state change in the connection flags hoping to cancel them before they become an FD event. Better make such calls transparent indirections to the FD layer instead and get rid of the deferred operations which needlessly complicate the logic inside. This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE. A number of functions related to polling updates were either greatly simplified or removed. Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data were expected to be sent after a PROXY protocol or SOCKSv4 header. These ones were simply replaced with a check on the subscription which is where we ought to get the authoritative information from. Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts are the same. conn_stop_polling() and conn_xprt_stop_both() are the same as well. conn_cond_update_polling() only causes errors to stop polling. 
It also becomes way more obvious that muxes should not at all employ conn_xprt_{want\|stop}_{recv,send}(), and that the call to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer is inappropriate, it ought to unsubscribe from reads instead. All of this definitely requires a serious cleanup.	2020-02-21 11:21:12 +01:00
Christopher Faulet	727a3f1ca3	MINOR: http-htx: Add a function to retrieve the headers size of an HTX message http_get_hdrs_size() function may now be used to get the bytes held by headers in an HTX message. It only works if the headers were not already forwarded. Metadata are not counted here.	2020-02-18 11:19:57 +01:00
Willy Tarreau	a71667c07d	BUG/MINOR: tools: also accept '+' as a valid character in an identifier The function is_idchar() was added by commit `36f586b` ("MINOR: tools: add is_idchar() to tell if a char may belong to an identifier") to ease matching of sample fetch/converter names. But it lacked support for the '+' character used in "base32+src" and "url32+src". A quick way to figure the list of supported sample fetch+converter names is to issue the following command: git grep '"[^"]",.SMP_T_.*SMP_USE_'\|cut -f2 -d'"'\|sort -u No more entry is reported once searching for characters not covered by is_idchar(). No backport is needed.	2020-02-17 06:37:40 +01:00
Willy Tarreau	e3b57bf92f	MINOR: sample: make sample_parse_expr() able to return an end pointer When an end pointer is passed, instead of complaining that a comma is missing after a keyword, sample_parse_expr() will silently return the pointer to the current location into this return pointer so that the caller can continue its parsing. This will be used by more complex expressions which embed sample expressions, and may even permit to embed sample expressions into arguments of other expressions.	2020-02-14 19:02:06 +01:00
Willy Tarreau	80b53ffb1c	MEDIUM: arg: make make_arg_list() stop after its own arguments The main problem we're having with argument parsing is that at the moment the caller looks for the first character looking like an end of arguments (')') and calls make_arg_list() on the sub-string inside the parenthesis. Let's first change the way it works so that make_arg_list() also consumes the parenthesis and returns the pointer to the first char not consumed. This will later permit to refine each argument parsing. For now there is no functional change.	2020-02-14 19:02:06 +01:00
Willy Tarreau	d4ad669051	MINOR: chunk: implement chunk_strncpy() to copy partial strings This does like chunk_strcpy() except that the maximum string length may be limited by the caller. A trailing zero is always appended. This is particularly handy to extract portions of strings to put into the trash for use with libc functions requiring a nul-terminated string.	2020-02-14 19:02:06 +01:00
Willy Tarreau	36f586b694	MINOR: tools: add is_idchar() to tell if a char may belong to an identifier This function will simply be used to find the end of config identifiers (proxies, servers, ACLs, sample fetches, converters, etc).	2020-02-14 19:02:06 +01:00
Ilya Shipitsin	88a2f0304c	CLEANUP: ssl: remove unused functions in openssl-compat.h functions SSL_SESSION_get0_id_context, SSL_CTX_get_default_passwd_cb, SSL_CTX_get_default_passwd_cb_userdata are not used anymore	2020-02-14 16:15:00 +01:00
Willy Tarreau	160ad9e38a	CLEANUP: mini-clist: simplify nested do { while(1) {} } while (0) While looking for other occurrences of do { continue; } while (0) I found these few leftovers in mini-clist where an outer loop was made around "do { } while (0)" then another loop was placed inside just to handle the continue. Let's clean this up by just removing the outer one. Most of the patch is only the inner part of the loop that is reindented. It was verified that the resulting code is the same.	2020-02-11 10:27:04 +01:00
Christopher Faulet	7716cdf450	MINOR: lua: Get the action return code on the stack when an action finishes When an action successfully finishes, the action return code (ACT_RET_*) is now retrieved from the stack, if the first element is an integer. In addition, in hlua_txn_done(), the value ACT_RET_DONE is pushed on the stack before exiting. Thus, when a script uses this function, the corresponding action still finishes with the good code. Thanks to this change, the flag HLUA_STOP is now useless. So it has been removed. It is a mandatory step to allow a lua action to return any action return code.	2020-02-06 15:13:03 +01:00
Christopher Faulet	07a718e712	CLEANUP: lua: Remove consistency check for sample fetches and actions It is not possible anymore to alter the HTTP parser state from lua sample fetches or lua actions. So there is no reason to still check for the parser state consistency.	2020-02-06 15:13:03 +01:00
Christopher Faulet	4a2c142779	MEDIUM: http-rules: Support extra headers for HTTP return actions It is now possible to append extra headers to the generated responses by HTTP return actions, while it is not based on an errorfile. For return actions based on errorfiles, these extra headers are ignored. To define an extra header, a "hdr" argument must be used with a name and a value. The value is a log-format string. For instance: http-request return status 200 hdr "x-src" "%[src]" hdr "x-dst" "%[dst]"	2020-02-06 15:13:03 +01:00
Christopher Faulet	24231ab61f	MEDIUM: http-rules: Add the return action to HTTP rules Thanks to this new action, it is now possible to return any responses from HAProxy, with any status code, based on an errorfile, a file or a string. Unlike the other internal messages generated by HAProxy, these ones are not interpreted as errors. And it is not necessary to use a file containing a full HTTP response, although it is still possible. In addition, using a log-format string or a log-format file, it is possible to have responses with a dynamic content. This action can be used on the request path or the response path. The only constraint is to have a response smaller than a buffer. And to avoid any warning the buffer space reserved to the headers rewriting should also be free. When a response is returned with a file or a string as payload, it only contains the content-length header and the content-type header, if applicable. Here are examples: http-request return content-type image/x-icon file /var/www/favicon.ico \ if { path /favicon.ico } http-request return status 403 content-type text/plain \ lf-string "Access denied. IP %[src] is blacklisted." \ if { src -f /etc/haproxy/blacklist.lst }	2020-02-06 15:12:54 +01:00
Christopher Faulet	6d0c3dfac6	MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding This patch introduces the 'http-after-response' rules. These rules are evaluated at the end of the response analysis, just before the data forwarding, on ALL HTTP responses, the server ones but also all responses generated by HAProxy. Thanks to this ruleset, it is now possible for instance to add some headers to the responses generated by the stats applet. Following actions are supported : * allow * add-header * del-header * replace-header * replace-value * set-header * set-status * set-var * strict-mode * unset-var	2020-02-06 14:55:34 +01:00
Christopher Faulet	ef70e25035	MINOR: http-ana: Add a function for forward internal responses Operations performed when internal responses (redirect/deny/auth/errors) are returned are always the same. The http_forward_proxy_resp() function is added to group all of them under a unique function.	2020-02-06 14:55:34 +01:00
Christopher Faulet	72c7d8d040	MINOR: http-ana: Rely on http_reply_and_close() to handle server error The http_server_error() function now relies on http_reply_and_close(). Both do almost the same actions. In addition, http_server_error() sets the error flag and the final state flag on the stream.	2020-02-06 14:55:34 +01:00
Christopher Faulet	c87e468816	MINOR: http-rules: Add a flag on redirect rules to know the rule direction HTTP redirect rules can be evaluated on the request or the response path. So when a redirect rule is evaluated, it is important to have this information because some specific processing may be performed depending on the direction. So the REDIRECT_FLAG_FROM_REQ flag has been added. It is set when applicable on the redirect rule during the parsing. This patch is mandatory to fix a bug on redirect rule. It must be backported to all stable versions.	2020-02-06 14:55:34 +01:00
Christopher Faulet	a4168434a7	MINOR: dns: Dynamically allocate dns options to reduce the act_rule size <.arg.dns.dns_opts> field in the act_rule structure is now dynamically allocated when a do-resolve rule is parsed. This drastically reduces the structure size.	2020-02-06 14:55:34 +01:00
Christopher Faulet	7651362e52	MINOR: htx/channel: Add a function to copy an HTX message in a channel's buffer The channel_htx_copy_msg() function can now be used to copy an HTX message in a channel's buffer. This function takes care to not overwrite existing data. This patch depends on the commit "MINOR: htx: Add a function to append an HTX message to another one". Both are mandatory to fix a bug in http_reply_and_close() function. Be careful to backport both first.	2020-02-06 14:55:16 +01:00
Christopher Faulet	0ea0c86753	MINOR: htx: Add a function to append an HTX message to another one The htx_append_msg() function can now be used to append an HTX message to another one. All the message is copied or nothing. If an error occurs during the copy, all changes are rolled back. This patch is mandatory to fix a bug in http_reply_and_close() function. Be careful to backport it first.	2020-02-06 14:54:47 +01:00
Olivier Houchard	1c7c0d6b97	BUG/MAJOR: memory: Don't forget to unlock the rwlock if the pool is empty. In __pool_get_first(), don't forget to unlock the pool lock if the pool is empty, otherwise no writer will be able to take the lock, and as it is done when reloading, it leads to an infinite loop on reload. This should be backported with commit `04f5fe87d3`	2020-02-03 13:05:31 +01:00
Olivier Houchard	04f5fe87d3	BUG/MEDIUM: memory: Add a rwlock before freeing memory. When using lockless pools, add a new rwlock, flush_pool. read-lock it when getting memory from the pool, so that concurrent accesses are still authorized, but write-lock it when we're about to free memory, in pool_flush() and pool_gc(). The problem is, when removing an item from the pool, we unreference it to get the next one, however, that pointer may have been free'd in the meanwhile, and that could provoke a crash if the pointer has been unmapped. It should be OK to use a rwlock, as normal operations will still be able to access the pool concurrently, and calls to pool_flush() and pool_gc() should be pretty rare. This should be backported to 2.1, 2.0 and 1.9.	2020-02-01 18:08:34 +01:00
Willy Tarreau	b30a153cd1	MINOR: task: detect self-wakeups on tl==sched->current instead of TASK_RUNNING This is exactly what we want to detect (a task/tasklet waking itself), so let's use the proper condition for this.	2020-01-31 17:45:10 +01:00
Willy Tarreau	bb238834da	MINOR: task: permanently flag tasklets waking themselves up Commit `a17664d829` ("MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet") tried to inflict a penalty to self-requeuing tasks/tasklets which correspond to those involved in large, high-latency data transfers, for the benefit of all other processing which requires a low latency. However, it turns out that while it ought to do this on a case-by-case basis, basing itself on the RUNNING flag isn't accurate because this flag doesn't leave for tasklets, so we'd rather need a distinct flag to tag such tasklets. This commit introduces TASK_SELF_WAKING to mark tasklets acting like this. For now it's still set when TASK_RUNNING is present but this will have to change. The flag is kept across wakeups.	2020-01-31 17:45:10 +01:00
Willy Tarreau	a17664d829	MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet When a tasklet re-runs itself such as in this chain: si_cs_io_cb -> si_cs_process -> si_notify -> si_chk_rcv then we know it can easily clobber the run queue and harm latency. Now what the scheduler does when it detects this is that such a tasklet is automatically placed into the bulk list so that it's processed with the remaining CPU bandwidth only. Thanks to this the CLI becomes instantly responsive again even under heavy stress at 50 Gbps over 40kcon and 100% CPU on 16 threads.	2020-01-30 19:03:31 +01:00
Willy Tarreau	a62917b890	MEDIUM: tasks: implement 3 different tasklet classes with their own queues We used to mix high latency tasks and low latency tasklets in the same list, and to even refill bulk tasklets there, causing some unfairness in certain situations (e.g. poll-less transfers between many connections saturating the machine with similarly-sized in and out network interfaces). This patch changes the mechanism to split the load into 3 lists depending on the task/tasklet's desired classes : - URGENT: this is mainly for tasklets used as deferred callbacks - NORMAL: this is for regular tasks - BULK: this is for bulk tasks/tasklets Arbitrary ratios of max_processed are picked from each of these lists in turn, with the ability to complete in one list from what was not picked in the previous one. After some quick tests, the following setup gave apparently good results both for raw TCP with splicing and for H2-to-H1 request rate: - 0 to 75% for urgent - 12 to 50% for normal - 12 to what remains for bulk Bulk is not used yet.	2020-01-30 18:59:33 +01:00
Willy Tarreau	911db9bd29	MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE As mentioned in commit `c192b0ab95` ("MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of consistency on which flags are checked among L4/L6/HANDSHAKE depending on the code areas. A number of sample fetch functions only check for L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and many check both L4L6 and HANDSHAKE. This patch starts to make all of this more consistent by introducing a new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and reports whether the transport layer is ready or not. All inconsistent call places were updated to rely on this one each time the goal was to check for the readiness of the transport layer.	2020-01-23 16:34:26 +01:00
Willy Tarreau	4450b587dd	MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE Most places continue to check CO_FL_HANDSHAKE while in fact they should check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one dedicated to SSL renegotiation. In fact the SSL layer should be the only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data when a renegotiation is in progress, but other ones randomly include it without knowing. And ideally it should even be an internal flag that's not exposed in the connection. This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL. In order to limit the confusion that has accumulated over time, the CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake, possibly used by a renegotiation was moved after the other ones.	2020-01-23 16:34:26 +01:00
Willy Tarreau	c192b0ab95	MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_* Commit `477902bd2e` ("MEDIUM: connections: Get ride of the xprt_done callback.") broke the master CLI for a very obscure reason. It happens that short requests immediately terminated by a shutdown are properly received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not yet set on the connection since we've not passed again through conn_fd_handler() and it was not done in conn_complete_session(). While commit `a8a415d31a` ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session()") fixed the issue, such an accident may happen again as the root cause is deeper and actually comes down to the fact that CO_FL_CONNECTED is lazily set at various check points in the code but not every time we drop one wait bit. It is not the first time we face this situation. Originally this flag was used to detect the transition between WAIT_* and CONNECTED in order to call ->wake() from the FD handler. But since at least 1.8-dev1 with commit `7bf3fa3c23` ("BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is always synchronized against the two others before being checked. Moreover, with the I/Os moved to tasklets, the decision to call the ->wake() function is performed after the I/Os in si_cs_process() and equivalent, which don't care about this transition either. So in essence, checking for CO_FL_CONNECTED has become a lazy way to check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always relies on someone else having synchronized it. This patch addresses it once and for all by killing this flag and only checking the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). 
This revealed a number of inconsistencies that were purposely not addressed here for the sake of bisectability: - while most places do check both L4+L6 and HANDSHAKE at the same time, some places like assign_server() or back_handle_st_con() and a few sample fetches looking for proxy protocol do check for L4+L6 but don't care about HANDSHAKE ; these ones will probably fail on TCP request session rules if the handshake is not complete. - some handshake handlers do validate that a connection is established at L4 but didn't clear CO_FL_WAIT_L4_CONN - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6 before declaring the mux ready while the snd_buf function also checks for the handshake's completion. Likely the former should validate the handshake as well and we should get rid of these extra tests in snd_buf. - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only later clear CO_FL_WAIT_L4_CONN. - xprt_handshake would set CO_FL_CONNECTED itself without actually clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if waiting for a pure Rx handshake. - most places in ssl_sock that were checking CO_FL_CONNECTED don't need to include the L4 check as an L6 check is enough to decide whether to wait for more info or not. It also becomes obvious when reading the test in si_cs_recv() that caused the failure mentioned above that once converted it doesn't make any sense anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete cannot happen since for CS_FL_EOS to be set, the other ones must have been validated. Some of these parts will still deserve further cleanup, and some of the observations above may induce some backports of potential bug fixes once totally analyzed in their context. The risk of breaking existing stuff is too high to blindly backport everything.	2020-01-23 14:41:37 +01:00
Olivier Houchard	477902bd2e	MEDIUM: connections: Get ride of the xprt_done callback. The xprt_done_cb callback was used to defer some connection initialization until we're connected and the handshake is done. As it mostly consists of creating the mux, instead of using the callback, introduce a conn_create_mux() function, that will just call conn_complete_session() for frontend, and create the mux for backend. In h2_wake(), make sure we call the wake method of the stream_interface, as we no longer wakeup the stream task.	2020-01-22 18:56:05 +01:00
Olivier Houchard	8af03b396a	MEDIUM: streams: Always create a conn_stream in connect_server(). In connect_server(), when creating a new connection for which we don't yet know the mux (because it'll be decided by the ALPN), instead of associating the connection to the stream_interface, always create a conn_stream. This way, we have less special-casing needed. Store the conn_stream in conn->ctx, so that we can reach the upper layers if needed.	2020-01-22 18:55:59 +01:00
Emmanuel Hocdet	6b5b44e10f	BUG/MINOR: ssl: ssl_sock_load_pem_into_ckch is not consistent The "set ssl cert <filename> <payload>" CLI command should have the same result as reloading HAProxy with the updated pem file (<filename>). This is not the case: DHparams/cert-chain is kept from the previous context if no DHparams/cert-chain is set in the context (<payload>). This patch should be backported to 2.1.	2020-01-22 15:55:55 +01:00
Adis Nezirovic	1a693fc2fd	MEDIUM: cli: Allow multiple filter entries for "show table" For complex stick tables with many entries/columns, it can be beneficial to filter using multiple criteria. The maximum number of filter entries can be controlled by defining STKTABLE_FILTER_LEN during build time. This patch can be backported to older releases.	2020-01-22 14:33:17 +01:00
Ilya Shipitsin	056c629531	BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x While working on issue #429, I encountered build failures with various non-released openssl versions. Let us improve the ssl defines and switch to features, not versions, for EVP_CTRL_AEAD_SET_IVLEN and EVP_CTRL_AEAD_SET_TAG. No backport is needed as there is no valid reason to build a stable haproxy version against a development version of openssl.	2020-01-22 07:54:52 +01:00
Willy Tarreau	2086365f51	CLEANUP: pattern: remove the pat_time definition It was inherited from acl_time, introduced in 1.3.10 by commit `a84d374367` ("[MAJOR] new framework for generic ACL support") and was never ever used. Let's simply drop it now.	2020-01-22 07:44:36 +01:00
Tim Duesterhus	6a0dd73390	CLEANUP: Consistently `unsigned int` for bitfields Signed bitfields of size `1` hold the values `0` and `-1`, but are usually assigned `1`, possibly leading to subtle bugs when the value is explicitly compared against `1`.	2020-01-22 07:28:39 +01:00
Baptiste Assmann	13a9232ebc	MEDIUM: dns: use Additional records from SRV responses Most DNS servers provide A/AAAA records in the Additional section of a response, which correspond to the SRV records from the Answer section: ;; QUESTION SECTION: ;_http._tcp.be1.domain.tld. IN SRV ;; ANSWER SECTION: _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A1.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A8.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A5.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A6.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A4.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A3.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A2.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A7.domain.tld. ;; ADDITIONAL SECTION: A1.domain.tld. 3600 IN A 192.168.0.1 A8.domain.tld. 3600 IN A 192.168.0.8 A5.domain.tld. 3600 IN A 192.168.0.5 A6.domain.tld. 3600 IN A 192.168.0.6 A4.domain.tld. 3600 IN A 192.168.0.4 A3.domain.tld. 3600 IN A 192.168.0.3 A2.domain.tld. 3600 IN A 192.168.0.2 A7.domain.tld. 3600 IN A 192.168.0.7 SRV record support was introduced in HAProxy 1.8 and the first design did not take into account the records from the Additional section. Instead, a new resolution is associated to each server with its relevant FQDN. This behavior generates a lot of DNS requests (1 SRV + 1 per server associated). This patch aims at fixing this by: - when a DNS response is validated, we associate A/AAAA records to relevant SRV ones - set a flag on associated servers to prevent them from running a DNS resolution for said FQDN - update server IP address with information found in the Additional section If no relevant record can be found in the Additional section, then HAProxy will fall back to running a dedicated resolution for this server, as it used to do. This behavior is the one described in RFC 2782.	2020-01-22 07:19:54 +01:00
Christopher Faulet	2f5339079b	MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive It is now possible to insert any attribute when a cookie is inserted by HAProxy. Any value may be set, no check is performed except the syntax validity (CTRL chars and ';' are forbidden). For instance, it may be used to add the SameSite attribute: cookie SRV insert attr "SameSite=Strict" The attr option may be repeated to add several attributes. This patch should fix the issue #361.	2020-01-22 07:18:31 +01:00
Christopher Faulet	554c0ebffd	MEDIUM: http-rules: Support an optional error message in http deny rules It is now possible to set the error message to use when a deny rule is executed. It may be a specific error file, adding "errorfile <file>" : http-request deny deny_status 400 errorfile /etc/haproxy/errorfiles/400badreq.http It may also be an error file from an http-errors section, adding "errorfiles <name>" : http-request deny errorfiles my-errors # use 403 error from "my-errors" section When defined, this error message is set in the HTTP transaction. The tarpit rule is also concerned by this change.	2020-01-20 15:18:46 +01:00
Christopher Faulet	473e880a25	MINOR: http-ana: Add an error message in the txn and send it when defined It is now possible to set the error message to return to client in the HTTP transaction. If it is defined, this error message is used instead of proxy's errors or default errors.	2020-01-20 15:18:46 +01:00
Christopher Faulet	76edc0f29c	MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy It is now possible to import in a proxy, fully or partially, error files declared in an http-errors section. It may be done using the "errorfiles" directive, followed by a name and optionally a list of status codes. If there is no status code specified, all error files of the http-errors section are imported. Otherwise, only error files associated to the listed status codes are imported. For instance : http-errors my-errors errorfile 400 ... errorfile 403 ... errorfile 404 ... frontend frt errorfiles my-errors 403 404 # ==> error 400 not imported	2020-01-20 15:18:46 +01:00
Christopher Faulet	35cd81d363	MINOR: http-htx: Add a new section to create groups of custom HTTP errors A new section may now be declared in the configuration to create global groups of HTTP errors. These groups are not linked to a proxy and are referenced by name. The section must be declared using the keyword "http-errors" followed by the group name. This name must be unique. A list of "errorfile" directives may be declared in such section. For instance: http-errors website-1 errorfile 400 /path/to/site1/400.http errorfile 404 /path/to/site1/404.http http-errors website-2 errorfile 400 /path/to/site2/400.http errorfile 404 /path/to/site2/404.http For now, it is just possible to create "http-errors" sections. There is no documentation because these groups are not used yet.	2020-01-20 15:18:46 +01:00
Christopher Faulet	5885775de1	MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages All custom HTTP errors are now stored in a global tree. Proxies use references to these messages. The key used for errorfile directives is the file name as specified in the configuration. For errorloc directives, a key is created using the redirect code and the url. This means that the same custom error message is now stored only once. It may be used in several proxies or for several status codes, it is only parsed and stored once.	2020-01-20 15:18:46 +01:00
Christopher Faulet	bdf6526e94	MINOR: http-htx: Add functions to create HTX redirect message http_parse_errorloc() may now be used to create an HTTP 302 or 303 redirect message with a specific url passed as parameter. A parameter is used to know whether it is a 302 or a 303 redirect. A status code is passed as parameter. It must be one of the supported HTTP error codes to be valid. Otherwise an error is returned. It aims to be used to parse "errorloc" directives. It relies on http_load_errormsg() to do most of the job, i.e. converting it to HTX.	2020-01-20 15:18:45 +01:00
Christopher Faulet	5031ef58ca	MINOR: http-htx: Add functions to read a raw error file and convert it in HTX http_parse_errorfile() may now be used to parse a raw HTTP message from a file. A status code is passed as parameter. It must be one of the supported HTTP error codes to be valid. Otherwise an error is returned. It aims to be used to parse "errorfile" directives. It relies on http_load_errorfile() to do most of the job, i.e. reading the file content and converting it to HTX.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d73b96d48c	MINOR: tcp-rules: Make tcp-request capture a custom action Now, this action uses its own dedicated function and is no longer handled "in place" during the TCP rules evaluation. Thus the action name ACT_TCP_CAPTURE is removed. The action type is set to ACT_CUSTOM and a check function is used to know if the rule depends on request contents while there is no inspect-delay.	2020-01-20 15:18:45 +01:00
Christopher Faulet	ac98d81f46	MINOR: http-rule/tcp-rules: Make track-sc* custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the TCP/HTTP rules evaluation. Thus the action names ACT_ACTION_TRK_SC0 and ACT_ACTION_TRK_SCMAX are removed. The action type is now the tracking index. Thus the function trk_idx() is no longer needed.	2020-01-20 15:18:45 +01:00
Christopher Faulet	91b3ec13c6	MEDIUM: http-rules: Make early-hint custom actions Now, the early-hint action uses its own dedicated action and is no longer handled "in place" during the HTTP rules evaluation. Thus the action name ACT_HTTP_EARLY_HINT is removed. In addition, http_add_early_hint_header() and http_reply_103_early_hints() are also removed. This part is now handled in the new action_ptr callback function.	2020-01-20 15:18:45 +01:00
Christopher Faulet	046cf44f6c	MINOR: http-rules: Make set/del-map and add/del-acl custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_*_ACL and ACT_HTTP_*_MAP are removed. The action type is now mapped as follows: 0 = add-acl, 1 = set-map, 2 = del-acl and 3 = del-map.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d1f27e3394	MINOR: http-rules: Make set-header and add-header custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_SET_HDR and ACT_HTTP_ADD_VAL are removed. The action type is now set to 0 to set a header (so remove existing ones if any and add a new one) or to 1 to add a header (add without remove).	2020-01-20 15:18:45 +01:00
Christopher Faulet	92d34fe38d	MINOR: http-rules: Make replace-header and replace-value custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_REPLACE_HDR and ACT_HTTP_REPLACE_VAL are removed. The action type is now set to 0 to evaluate the whole header or to 1 to evaluate each comma-delimited value. The function http_transform_header_str() is renamed to http_replace_hdrs() to be more explicit and the function http_transform_header() is removed. In fact, this last one is now more or less the new action function. The lua code has been updated accordingly to use http_replace_hdrs().	2020-01-20 15:18:45 +01:00
Christopher Faulet	006f6507d7	MINOR: actions: Use an integer to set the action type <action> field in the act_rule structure is now an integer. The act_name values are used for all actions without action function (but it is not a prerequisite though) or the action will have no effect. But for all other actions, any integer value may be used, only the action function will take care of it. The default for such actions is ACT_CUSTOM.	2020-01-20 15:18:45 +01:00
Christopher Faulet	245cf795c1	MINOR: actions: Add flags to configure the action behaviour Some flags can now be set on an action when it is registered. The flags are defined in the act_flag enum. For now, only ACT_FLAG_FINAL may be set on an action to specify if it stops the rules evaluation. It is set on ACT_ACTION_ALLOW, ACT_ACTION_DENY, ACT_HTTP_REQ_TARPIT, ACT_HTTP_REQ_AUTH, ACT_HTTP_REDIR and ACT_TCP_CLOSE actions. But, when required, it may also be set on custom actions. Consequently, this flag is checked instead of the action type during the configuration parsing to trigger a warning when a rule inhibits all the following ones.	2020-01-20 15:18:45 +01:00
Christopher Faulet	105ba6cc54	MINOR: actions: Rename the act_flag enum into act_opt The flags in the act_flag enum have been renamed act_opt. It means ACT_OPT prefix is used instead of ACT_FLAG. The purpose of this patch is to reserve the action flags for the actions configuration.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cd26e8a2ec	MINOR: http-rules/tcp-rules: Call the defined action function first if defined When TCP and HTTP rules are evaluated, if an action function (action_ptr field in the act_rule structure) is defined for a given action, it is now always called in priority over the test on the action type. Concretely, for now, only custom actions define it. Thus there is no change. It just leaves us the choice to extend the action type beyond the existing ones in the enum.	2020-01-20 15:18:45 +01:00
Christopher Faulet	96bff76087	MINOR: actions: Regroup some info about HTTP rules in the same struct Info used by HTTP rules manipulating the message itself are split in several structures in the arg union. But it is possible to group all of them in a unique struct. Now, <arg.http> is used by most of these rules, which contains: * <arg.http.i> : an integer used as status code, nice/tos/mark/loglevel or action id. * <arg.http.str> : an IST used as header name, reason string or auth realm. * <arg.http.fmt> : a log-format compatible expression * <arg.http.re> : a regular expression used by replace rules	2020-01-20 15:18:45 +01:00
Christopher Faulet	58b3564fde	MINOR: actions: Add a function pointer to release args used by actions Arguments used by actions are never released during HAProxy deinit. Now, it is possible to specify a function to do so. ".release_ptr" field in the act_rule structure may be set during the configuration parsing to a specific deinit function depending on the action type.	2020-01-20 15:18:45 +01:00
Christopher Faulet	e00d06c99f	MINOR: http-rules: Handle all message rewrites the same way In HTTP rules, error handling during a rewrite is now handled the same way for all rules. First, allocation errors are reported as internal errors. Then, if soft rewrites are allowed, rewrite errors are ignored and only the failed_rewrites counter is incremented. Otherwise, when strict rewrites are mandatory, internal errors are returned. For now, only soft rewrites are supported. Note also that the warning sent to notify a rewrite failure was removed. It will be useless once strict rewrites are possible.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a00071e2e5	MINOR: http-ana: Add a txn flag to support soft/strict message rewrites The HTTP_MSGF_SOFT_RW flag must now be set on the HTTP transaction to ignore rewrite errors on a message, from HTTP rules. This mode is called soft rewrites. If this flag is not set, strict rewrites are performed. In this mode, if a rewrite error occurs, an internal error is reported. For now, HTTP_MSGF_SOFT_RW is always set and there is no way to switch a transaction to strict mode.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a08546bb5a	MINOR: counters: Remove failed_secu counter and use denied_resp instead The failed_secu counter is only used for the servers stats. It is used to report the number of denied responses. On proxies, the same info is stored in the denied_resp counter. So, it is more consistent to use the same field for servers.	2020-01-20 15:18:45 +01:00
Christopher Faulet	0159ee4032	MINOR: stats: Report internal errors in the proxies/listeners/servers stats The stats field ST_F_EINT has been added to report internal errors encountered per proxy, per listener and per server. It appears in the CLI export and on the HTML stats page.	2020-01-20 15:18:45 +01:00
Christopher Faulet	30a2a3724b	MINOR: http-rules: Add more return codes to let custom actions act as normal ones When HTTP/TCP rules are evaluated, especially HTTP ones, some results are possible for normal actions and not for custom ones. So missing return codes (ACT_RET_*) have been added to let custom actions act as normal ones. Concretely, the following codes have been added: * ACT_RET_DENY : deny the request/response. It must be handled by the caller * ACT_RET_ABRT : abort the request/response, handled by the action itself. * ACT_RET_INV : invalid request/response	2020-01-20 15:18:45 +01:00
Christopher Faulet	4d90db5f4c	MINOR: http-rules: Add a rule result to report internal error Now, when HTTP rules are evaluated, HTTP_RULE_RES_ERROR must be returned when an internal error is caught. It is a way to make the difference between a bad request or a bad response and an error during its processing.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d4ce6c2957	MINOR: counters: Add a counter to report internal processing errors This counter, named 'internal_errors', has been added in frontend and backend counters. It should be used when an internal error is encountered, instead of failed_req or failed_resp.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cb5501327c	BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules Functions to deinitialize the HTTP rules are buggy. These functions do not check the action name to release the right part in the arg union. Only few info are released. For auth rules, the realm is released and there is no problem here. But the regex <arg.hdr_add.re> is always unconditionally released. So it is easy to make these functions crash. For instance, with the following rule HAProxy crashes during the deinit : http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)] For now, these functions are simply removed and we rely on the deinit function used for TCP rules (renamed as deinit_act_rules()). This patch fixes the bug. But arguments used by actions are not released at all, this part will be addressed later. This patch must be backported to all stable versions.	2020-01-20 15:18:45 +01:00
Willy Tarreau	ee1a6fc943	MINOR: connection: make the last arg of subscribe() a struct wait_event* The subscriber used to be passed as a "void *param" that was systematically cast to a struct wait_event*. By now it appears clear that the subscribe() call at every layer is well defined and always takes a pointer to an event subscriber of type wait_event, so let's enforce this in the functions' prototypes, remove the intermediary variables used to cast it and clean up the comments to clarify what all these functions do in their context.	2020-01-17 18:30:37 +01:00
Willy Tarreau	7872d1fc15	MEDIUM: connection: merge the send_wait and recv_wait entries In practice all callers use the same wait_event notification for any I/O so instead of keeping specific code to handle them separately, let's merge them and it will allow us to create new events later.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3a9312af8f	REORG: stream/backend: move backend-specific stuff to backend.c For more than a decade we've kept all the sess_update_st_*() functions in stream.c while they're only there to work in relation with what is currently being done in backend.c (srv_redispatch_connect, connect_server, etc). Let's move all this pollution over there and take this opportunity to try to find slightly less confusing names for these old functions whose role is only to handle transitions from one specific stream-int state: sess_update_st_rdy_tcp() -> back_handle_st_rdy() sess_update_st_con_tcp() -> back_handle_st_con() sess_update_st_cer() -> back_handle_st_cer() sess_update_stream_int() -> back_try_conn_req() sess_prepare_conn_req() -> back_handle_st_req() sess_establish() -> back_establish() The last one remained in stream.c because it's more or less a completion function which does all the initialization expected on a connection success or failure, can set analysers and emit logs. The other ones could possibly slightly benefit from being modified to take a stream-int instead since it's really what they're working with, but it's unimportant here.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3381bf89e3	MEDIUM: connection: get rid of CO_FL_CURR_* flags These ones used to serve as a set of switches between CO_FL_SOCK_* and CO_FL_XPRT_*, and now that the SOCK layer is gone, they're always a copy of the last known CO_FL_XPRT_* ones that is resynchronized before I/O events by calling conn_refresh_polling_flags(), and that are pushed back to FDs when detecting changes with conn_xprt_polling_changes(). While these functions are not particularly heavy, what they do is totally redundant by now because the fd_want_*/fd_stop_*() actions already perform test-and-set operations to decide to create an entry or not, so they do the exact same thing that is done by conn_xprt_polling_changes(). As such it is pointless to call that one, and given that the only reason to keep CO_FL_CURR_* is to detect changes there, we can now remove them. Even if this does only save very few cycles, this removes a significant complexity that has been responsible for many bugs in the past, including the last one affecting FreeBSD. All tests look good, and no performance regressions were observed.	2020-01-17 17:45:12 +01:00
Willy Tarreau	e2a0eeca77	MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only CO_FL_WAIT_ROOM is set by the splicing function in raw_sock, and cleared by the stream-int when splicing is disabled, as well as in conn_refresh_polling_flags() so that a new call to ->rcv_pipe() could be attempted by the I/O callbacks called from conn_fd_handler(). This clearing in conn_refresh_polling_flags() makes no sense anymore and is in no way related to the polling at all. Since we don't call them from there anymore it's better to clear it before attempting to receive, and to set it again later. So let's move this operation where it should be, in raw_sock_to_pipe() so that it's now symmetric. It was also placed in raw_sock_to_buf() so that we're certain that it gets cleared if an attempt to splice is replaced with a subsequent attempt to recv(). And these were currently already achieved by the call to conn_refresh_polling_flags(). Now it could theoretically be removed from the stream-int.	2020-01-17 17:19:27 +01:00
Willy Tarreau	17ccd1a356	BUG/MEDIUM: connection: add a mux flag to indicate splice usability Commit `c640ef1a7d` ("BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible") fixed splicing in TCP and legacy mode but broke it badly in HTX mode. What happens in HTX mode is that the channel's to_forward value remains set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is not a reliable signal anymore to indicate whether more data are expected or not. Thus, when data are spliced out of the mux using rcv_pipe(), even when the end is reached (that only the mux knows about), the call to rcv_buf() to get the final HTX blocks completing the message was skipped and there was often no new event to wake this up, resulting in transfer timeouts at the end of large objects. All this goes down to the fact that the channel has no more information about whether it can splice or not despite being the one having to take the decision to call rcv_pipe() or not. And we cannot afford to call rcv_buf() unconditionally because, as the commit above showed, this reduces the forwarding performance by 2 to 3 in TCP and legacy modes due to data lying in the buffer preventing splicing from being used later. The approach taken by this patch consists in offering the muxes the ability to report a bit more information to the upper layers via the conn_stream. This information could simply be to indicate that more data are awaited but the real need being to distinguish splicing and receiving, here instead we clearly report the mux's willingness to be called for splicing or not. Hence the flag's name, CS_FL_MAY_SPLICE. The mux sets this flag when it knows that its buffer is empty and that data waiting past what is currently known may be spliced, and clears it when it knows there's no more data or that the caller must fall back to rcv_buf() instead. 
The stream-int code now uses this to determine if splicing may be used or not instead of looking at the rcv_pipe() callbacks through the whole chain. And after the rcv_pipe() call, it checks the flag again to decide whether it may safely skip rcv_buf() or not. All this bitfield dance remains a bit complex and it starts to appear obvious that splicing vs reading should be a decision of the mux based on permission granted by the data layer. This would however increase the API's complexity but definitely needs to be thought about, and should even significantly simplify the data processing layer. The way it was integrated in mux-h1 will also result in no more calls to rcv_pipe() on chunked encoded data, since these ones are currently disabled at the mux level. However once the issue with chunks+splice is fixed, it will be important to explicitly check for curr_len|CHNK to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk. This fix must be backported to 2.1 and 2.0.	2020-01-17 17:00:12 +01:00
Willy Tarreau	340b07e868	BUG/MAJOR: hashes: fix the signedness of the hash inputs Wietse Venema reported in the thread below that we have a signedness issue with our hashes implementations: due to the use of const char* for the input key that's often text, the crc32, sdbm, djb2, and wt6 algorithms return a platform-dependent value for binary input keys containing bytes with bit 7 set. This means that an ARM or PPC platform will hash binary inputs differently from an x86 typically. Worse, some algorithms are well defined in the industry (like CRC32) and do not provide the expected result on x86, possibly causing interoperability issues (e.g. a user-agent would fail to compare the CRC32 of a message body against the one computed by haproxy). Fortunately, and contrary to the first impression, the CRC32c variant used in the PROXY protocol processing is not affected. Thus the impact remains very limited (the vast majority of input keys are text-based, such as user-agent headers for example). This patch addresses the issue by fixing all hash functions' prototypes (even those not affected, for API consistency). A reg test will follow in another patch. The vast majority of users do not use these hashes. And among those using them, very few will pass them on binary inputs. However, for the rare ones doing it, this fix MAY have an impact during the upgrade. For example if the package is upgraded on one LB then on another one, and the CRC32 of a binary input is used as a stick table key (why?) then these CRCs will not match between both nodes. Similarly, if "hash-type ... crc32" is used, LB inconsistency may appear during the transition. For this reason it is preferable to apply the patch on all nodes using such hashes at the same time. Systems upgraded via their distros will likely observe the least impact since they're expected to be upgraded within a short time frame. 
And it is important for distros NOT to skip this fix, in order to avoid distributing an incompatible implementation of a hash. This is the reason why this patch is tagged as MAJOR, even though it's extremely unlikely that anyone will ever notice a change at all. This patch must be backported to all supported branches since the hashes were introduced in 1.5-dev20 (commit `98634f0c`). Some parts may be dropped since implemented later. Link to Wietse's report: https://marc.info/?l=postfix-users&m=157879464518535&w=2	2020-01-16 08:23:42 +01:00
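The signedness problem described in this commit can be reproduced with a minimal sketch. It uses the classic djb2 recurrence (hash * 33 + byte) and two hypothetical helpers, `djb2_signed()` and `djb2_unsigned()`, which are not HAProxy functions. The key is read explicitly as signed or unsigned bytes so that the result does not depend on whether plain `char` is signed on the build platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: djb2 reading the key as signed bytes. A byte with
 * bit 7 set is sign-extended to a negative value before being added. */
uint32_t djb2_signed(const void *key, size_t len)
{
	const signed char *p = key;
	uint32_t hash = 5381;

	while (len--)
		hash = hash * 33 + (uint32_t)*p++;
	return hash;
}

/* Hypothetical helper: same recurrence, but every byte is in 0..255. */
uint32_t djb2_unsigned(const void *key, size_t len)
{
	const unsigned char *p = key;
	uint32_t hash = 5381;

	while (len--)
		hash = hash * 33 + *p++;
	return hash;
}
```

For a text key such as "abc" both variants agree, while for a single 0x80 byte they differ, which is the kind of divergence the commit message attributes to binary input keys.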
Willy Tarreau	f31af9367e	MEDIUM: lua: don't call the GC as often when dealing with outgoing connections In order to properly close connections established from Lua in case a Lua context dies, the context currently automatically gets a flag HLUA_MUST_GC set whenever an outgoing connection is used. This causes the GC to be enforced on the context's death as well as on yield. First, it does not appear necessary to do it when yielding, since if the connections die they are already cleaned up. Second, the problem with the flag is that even if a connection gets properly closed, the flag is not removed and the GC continues to be called on the Lua context. The impact on performance looks quite significant, as noticed and diagnosed by Sadasiva Gujjarlapudi in the following thread: https://www.mail-archive.com/haproxy@formilux.org/msg35810.html This patch changes the flag for a counter so that each created connection increments it and each cleanly closed connection decrements it. That way we know we have to call the GC on the context's death only if the count is non-null. As reported in the thread above, the Lua performance gain is now over 20% by doing this. Thanks to Sada and Thierry for the design discussion and tests that led to this solution.	2020-01-14 10:12:31 +01:00
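The flag-to-counter change can be pictured with the following sketch. The structure and function names are hypothetical and only illustrate the bookkeeping described above: each created connection increments a counter, each cleanly closed one decrements it, and the GC is only needed on the context's death when the count is non-zero.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a Lua context; not HAProxy's real structure. */
struct lua_ctx_sketch {
	unsigned int open_conns; /* outgoing connections not yet cleanly closed */
};

/* Called when the context opens an outgoing connection. */
void ctx_conn_created(struct lua_ctx_sketch *ctx)
{
	ctx->open_conns++;
}

/* Called when an outgoing connection is cleanly closed. */
void ctx_conn_closed(struct lua_ctx_sketch *ctx)
{
	if (ctx->open_conns)
		ctx->open_conns--;
}

/* The GC only has to run on the context's death if connections remain. */
bool ctx_needs_gc_on_death(const struct lua_ctx_sketch *ctx)
{
	return ctx->open_conns != 0;
}
```

Unlike a sticky flag, the counter returns to zero once every connection has been closed, so a context that cleaned up after itself no longer pays for a forced GC.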
Olivier Houchard	3c4f40acbf	BUG/MEDIUM: tasks: Use the MT macros in tasklet_free(). In tasklet_free(), use MT_LIST_DEL() when attempting to remove ourselves; we can't just use LIST_DEL(), as we could theoretically be in the shared tasklet list. This should be backported to 2.1.	2020-01-10 16:56:59 +01:00
Florian Tham	9205fea13a	MINOR: http: Add 404 to http-request deny This patch adds http status code 404 Not Found to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Florian Tham	272e29b5cc	MINOR: http: Add 410 to http-request deny This patch adds http status code 410 Gone to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Willy Tarreau	eaf05be0ee	OPTIM: polling: do not create update entries for FD removal In order to reduce the number of poller updates, we can benefit from the fact that modern pollers use sampling to report readiness and that under load they rarely report the same FD multiple times in a row. As such it's not always necessary to disable such FDs especially when we're almost certain they'll be re-enabled again and will require another set of syscalls. Now instead of creating an update for a (possibly temporary) removal, we only perform this removal if the FD is reported again as ready while inactive. In addition this is performed via another update so that alternating workloads like transfers have a chance to re-enable the FD without any syscall during the loop (typically after the data that filled a buffer have been sent). However we only do that for single-threaded FDs as the other ones require a more complex setup and are not on the critical path. This does cause a few spurious wakeups but almost totally eliminates the calls to epoll_ctl() on connections seeing intermittent traffic like HTTP/1 to a server or client.
A typical example with 100k requests for 4 kB objects over 200 connections shows that the number of epoll_ctl() calls doesn't depend on the number of requests anymore but most exclusively on the number of established connections: Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 57.09 0.499964 0 654361 321190 recvfrom 38.33 0.335741 0 369097 1 epoll_wait 4.56 0.039898 0 44643 epoll_ctl 0.02 0.000211 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.875814 1068301 321391 total After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 59.25 0.504676 0 657600 323630 recvfrom 40.68 0.346560 0 374289 1 epoll_wait 0.04 0.000370 0 620 epoll_ctl 0.03 0.000228 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.851834 1032709 323831 total As expected there is also a slight increase of epoll_wait() calls since delaying de-activation of events can occasionally cause one spurious wakeup.	2019-12-27 16:38:47 +01:00
Willy Tarreau	19689882e6	MINOR: poller: do not call the IO handler if the FD is not active For now this almost never happens but with subsequent patches it will become more important not to uselessly call the I/O handlers if the FD is not active.	2019-12-27 16:38:47 +01:00
Willy Tarreau	0fbc318e24	CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE Both flags became equal in commit `82967bf9` ("MINOR: connection: adjust CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the overlap between xprt_done_cb() and wake() after the removal of the DATA specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the "_DONE" version already covers everything and explains the intent well enough.	2019-12-27 16:38:47 +01:00
Willy Tarreau	4970e5adb7	REORG: connection: move tcp_connect_probe() to conn_fd_check() The function is not TCP-specific at all, it covers all FD-based sockets so let's move this where other similar functions are, in connection.c, and rename it conn_fd_check().	2019-12-27 16:38:43 +01:00
Willy Tarreau	11ef0837af	MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP In practice it's all pollers except select(). It turns out that we're keeping some legacy code only for select and enforcing it on all pollers, let's offer the pollers the ability to declare that they do not need that.	2019-12-27 14:04:33 +01:00
Lukas Tribus	a26d1e1324	BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled with the no-deprecated option. Remove existing, incomplete guards and add a compatibility macro in openssl-compat.h, just as OpenSSL does: `bf4006a6f9/include/openssl/ssl.h (L1486)` This should be backported as far as 2.0 and probably even 1.9.	2019-12-21 06:46:55 +01:00
Rosen Penev	b3814c2ca8	BUG/MINOR: ssl: openssl-compat: Fix getm_ defines LIBRESSL_VERSION_NUMBER evaluates to 0 under OpenSSL, making the condition always true. Check for the define before checking it. Signed-off-by: Rosen Penev <rosenp@gmail.com> [wt: to be backported as far as 1.9]	2019-12-20 16:01:31 +01:00
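The preprocessor pitfall behind this fix can be shown in isolation. In an `#if` expression an undefined identifier evaluates to 0, so a bare "less than" comparison is true even when the library is absent. The sketch below uses a hypothetical macro `DEMO_LIB_VERSION`, deliberately left undefined, and an arbitrary threshold; neither comes from openssl-compat.h.

```c
/* DEMO_LIB_VERSION is intentionally NOT defined anywhere, mimicking
 * LIBRESSL_VERSION_NUMBER when building against OpenSSL. */

/* Unguarded test: the undefined macro is replaced by 0, and 0 is below
 * the threshold, so the branch is taken. */
int unguarded_matches(void)
{
#if (DEMO_LIB_VERSION < 0x20000000L)
	return 1;
#else
	return 0;
#endif
}

/* Guarded test: check that the macro exists before comparing it. */
int guarded_matches(void)
{
#if defined(DEMO_LIB_VERSION) && (DEMO_LIB_VERSION < 0x20000000L)
	return 1;
#else
	return 0;
#endif
}
```

Here `unguarded_matches()` returns 1 and `guarded_matches()` returns 0, which is the difference the commit describes as "check for the define before checking it".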
Willy Tarreau	dd0e89a084	BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing Since 1.9 with commit `b20aa9eef3` ("MAJOR: tasks: create per-thread wait queues") a task bound to a single thread will not use locks when being queued or dequeued because the wait queue is assumed to be the owner thread's. But there exists a rare situation where this is not true: the health check tasks may be running on one thread waiting for a response, and may in parallel be requeued by another thread calling health_adjust() after detecting a response error in traffic when "observe l7" is set, and "fastinter" is lower than "inter", requiring to shorten the running check's timeout. In this case, the task being requeued was present in another thread's wait queue, thus opening a race during task_unlink_wq(), and gets requeued into the calling thread's wait queue instead of the running one's, opening a second race here. This patch aims at protecting against the risk of calling task_unlink_wq() from one thread while the task is queued on another thread, hence unlocked, by introducing a new TASK_SHARED_WQ flag. This new flag indicates that a task's position in the wait queue may be adjusted by other threads than the one currently executing it. This means that such WQ manipulations must be performed under a lock. There are two types of such tasks: - the global ones, using the global wait queue (technically speaking, those whose thread_mask has at least 2 bits set). - some local ones, which for now will be placed into the global wait queue as well in order to benefit from its lock. The flag is automatically set on initialization if the task's thread mask indicates more than one thread. The caller must also set it if it intends to let other threads update the task's expiration delay (e.g. delegated I/Os), or if it intends to change the task's affinity over time as this could lead to the same situation.
Right now only the situation described above seems to be affected by this issue, and it is very difficult to trigger, and even then, will often have no visible effect beyond stopping the checks for example once the race is met. On my laptop it is feasible with the following config, chained to httpterm: global maxconn 400 # provoke FD errors, calling health_adjust() defaults mode http timeout client 10s timeout server 10s timeout connect 10s listen px bind :8001 option httpchk /?t=50 server sback 127.0.0.1:8000 backup server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7 This patch will automatically address the case for the checks because check tasks are created with multiple threads bound and will get the TASK_SHARED_WQ flag set. If in the future more tasks need to rely on this (multi-threaded muxes for example) and the use of the global wait queue becomes a bottleneck again, then it should not be too difficult to place locks on the local wait queues and queue the task on its bound thread. This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task". Many thanks to William Dauchy for providing detailed traces allowing to spot the problem.	2019-12-19 14:42:22 +01:00
Christopher Faulet	76014fd118	MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state During H1 parsing, the HTX EOM block is added before switching the message state to H1_MSG_DONE. It is an exception in the way to convert an H1 message to HTX. Except for this block, the message is first switched to the right state before starting to add the corresponding HTX blocks. For instance, the message is switched in H1_MSG_DATA state and then the HTX DATA blocks are added. With this patch, the message is switched to the H1_MSG_DONE state when all data blocks or trailers were processed. It is the caller's responsibility to call h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far easier to catch failures when the HTX buffer is full. The H1 and FCGI muxes have been updated accordingly. This patch may eventually be backported to 2.1 if it helps other backports.	2019-12-11 16:46:16 +01:00
Willy Tarreau	fec56c6a76	BUG/MINOR: listener: fix off-by-one in state name check As reported in issue #380, the state check in listener_state_str() is invalid as it allows state value 9 to report crap. We don't use such a state value so the issue should never happen unless the memory is already corrupted, but better clean this now while it's harmless. This should be backported to all maintained branches.	2019-12-11 15:51:37 +01:00
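The class of off-by-one described here is easy to reproduce. The sketch below is not the listener code: `state_name()` and the placeholder state names are hypothetical. It only shows why an index equal to the array size must be rejected, that is, why the bound check has to use `>=` rather than `>`.

```c
#include <stddef.h>

/* Nine placeholder names, so valid indexes are 0..8. */
static const char *const demo_states[] = {
	"S0", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8",
};

/* Hypothetical lookup helper. Writing the test as
 * "st > sizeof(...)/sizeof(...)" would let st == 9 through and read one
 * entry past the end of the array. */
const char *state_name(unsigned int st)
{
	if (st >= sizeof(demo_states) / sizeof(*demo_states))
		return "INVALID";
	return demo_states[st];
}
```

With the corrected comparison, `state_name(8)` yields the last valid name and `state_name(9)` yields "INVALID" instead of whatever happens to follow the array in memory.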
Willy Tarreau	d26c9f9465	BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers If a new process is started with -sf and it fails to bind, it may send a SIGTTOU to the master process in hope that it will temporarily unbind. Unfortunately this one doesn't catch it and stops to background instead of forwarding the signal to the workers. The same is true for SIGTTIN. This commit simply implements an extra signal handler for the master to deal with such signals that must be passed down to the workers. It must be backported as far as 1.8, though there the code differs in that it's entirely in haproxy.c and doesn't require an extra sig handler.	2019-12-11 14:26:53 +01:00
Willy Tarreau	c49ba52524	MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups We used to have wake_expired_tasks() wake up tasks and return the next expiration delay. The problem this causes is that we have to call it just before poll() in order to consider latest timers, but this also means that we don't wake up all newly expired tasks upon return from poll(), which thus systematically requires a second poll() round. This is visible when running any scheduled task like a health check, as there are systematically two poll() calls, one with the interval, nothing is done after it, and another one with a zero delay, and the task is called: listen test bind *:8001 server s1 127.0.0.1:1111 check 09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0 09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0 09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0 >> nothing run here, as the expired task was not woken up yet. 
09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0 09:37:39.202505 epoll_wait(3, [], 200, 0) = 0 09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0 >> now the expired task was woken up 09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0 09:37:39.202683 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0 09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:37:39.202715 close(7) = 0 Let's instead split the function in two parts: - the first part, wake_expired_tasks(), called just before process_runnable_tasks(), wakes up all expired tasks; it doesn't compute any timeout. - the second part, next_timer_expiry(), called just before poll(), only computes the next timeout for the current thread. 
Thanks to this, all expired tasks are properly woken up when leaving poll, and each poll call's timeout remains up to date: 09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0 09:41:16.270457 epoll_wait(3, [], 200, 999) = 0 09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0 09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0 09:41:17.270323 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0 09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:41:17.270367 close(7) = 0 This may be backported to 2.1 and 2.0 though it's unlikely to bring any user-visible improvement except to clarify debugging.	2019-12-11 09:42:58 +01:00
Willy Tarreau	440d09b244	BUG/MINOR: tasks: only requeue a task if it was already in the queue Commit `0742c314c3` ("BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().") had a slight side effect on expired timeouts: when used before a timeout is updated, it causes an existing task to be requeued earlier than its expected timeout, making the next poll wake up too early, or even instantly if the previous wakeup was caused by a timeout. This is visible in strace when health checks are enabled because there are two poll calls, one of which has a short or zero delay. The correct solution is to only requeue a task if it was already in the queue. This can be backported to all branches having the fix above.	2019-12-11 09:21:36 +01:00
Willy Tarreau	a1d97f88e0	REORG: listener: move the global listener queue code to listener.c The global listener queue code and declarations were still lying in haproxy.c while not needed there anymore at all. This complicates the code for no reason. As a result, the global_listener_queue_task and the global_listener_queue were made static.	2019-12-10 14:16:03 +01:00
Willy Tarreau	241797a3fc	MINOR: listener: split dequeue_all_listener() in two We use it half of the time for the global_listener_queue and half of the time for a proxy's queue, which requires the callers to take care of both cases. Let's split it in two versions, the current one working only on the global queue and another one dedicated to proxies for the per-proxy queues. This cleans up quite a bit of code.	2019-12-10 14:14:09 +01:00
Willy Tarreau	a45a8b5171	MEDIUM: init: set NO_NEW_PRIVS by default when supported HAProxy doesn't need to call executables at run time (except when using external checks which are strongly recommended against), and is even expected to isolate itself into an empty chroot. As such, there basically is no valid reason to allow a setuid executable to be called without the user being fully aware of the risks. In a situation where haproxy would need to call external checks and/or disable chroot, exploiting a vulnerability in a library or in haproxy itself could lead to the execution of an external program. On Linux it is possible to lock the process so that any setuid bit present on such an executable is ignored. This significantly reduces the risk of privilege escalation in such a situation. This is what haproxy does by default. In case this causes a problem to an external check (for example one which would need the "ping" command), then it is possible to disable this protection by explicitly adding this directive in the global section. If enabled, it is possible to turn it back off by prefixing it with the "no" keyword. Before the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" uid=0(root) gid=0(root) groups=0(root After the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?	2019-12-06 17:20:26 +01:00
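On Linux, the locking described above corresponds to the `no_new_privs` process attribute, which is set through prctl(2). The following is a minimal, Linux-only sketch under that assumption; the wrapper name `set_no_new_privs()` is hypothetical and the code is not taken from HAProxy.

```c
#include <sys/prctl.h>

/* Hypothetical wrapper: ask the kernel to ignore setuid/setgid bits and
 * file capabilities on any program executed from now on. Returns 0 on
 * success, -1 on error. The attribute cannot be cleared once set. */
int set_no_new_privs(void)
{
#if defined(PR_SET_NO_NEW_PRIVS)
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
#else
	/* Headers too old to know the option: nothing to do in this sketch. */
	return 0;
#endif
}
```

Once the attribute is set it is inherited across fork() and execve(), which is consistent with the sudo failure shown in the "after" output of the commit message.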
Olivier Houchard	0742c314c3	BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity(). In task_set_affinity(), leave the wait_queue if any before changing the affinity, and re-enter a wait queue once it is done. If we don't do that, the task may stay in the wait queue of another thread, and we later may end up modifying that wait queue while holding no lock, which could lead to memory corruption. This should be backported to 2.1, 2.0 and 1.9.	2019-12-05 15:11:19 +01:00
Willy Tarreau	d96f1126fe	MEDIUM: init: prevent process and thread creation at runtime Some concerns are regularly raised about the risk to inherit some Lua files which make use of a fork (e.g. via os.execute()) as well as whether or not some of the bugs we fix might be exploitable to run some code. Given that haproxy is event-driven, any foreground activity completely stops processing and is easy to detect, but background activity is a different story. A Lua script could very well discreetly fork a sub-process connecting to a remote location and taking commands, and some injected code could also try to hide its activity by creating a process or a thread without blocking the rest of the processing. While such activities should be extremely limited when run in an empty chroot without any permission, it would be better to get a higher assurance they cannot happen. This patch introduces something very simple: it limits the number of processes and threads to zero in the workers after the last thread was created. By doing so, it effectively instructs the system to fail on any fork() or clone() syscall. Thus any undesired activity has to happen in the foreground and is way easier to detect. This will obviously break external checks (whose concept is already totally insecure), and for this reason a new option "insecure-fork-wanted" was added to disable this protection, and it is suggested in the fork() error report from the checks. It is obviously recommended not to use it and to reconsider the reasons leading to it being enabled in the first place. If for any reason we fail to disable forks, we still start because it could be imaginable that some operating systems refuse to set this limit to zero, but in this case we emit a warning, that may or may not be reported since we're after the fork point. Ideally over the long term it should be conditioned by strict-limits and cause a hard fail.	2019-12-03 11:49:00 +01:00
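One plausible way to "limit the number of processes and threads to zero" is the RLIMIT_NPROC resource limit; the commit message does not name the mechanism, so this is an assumption. The sketch below uses a hypothetical wrapper, `forbid_new_processes()`, and is not HAProxy's code.

```c
#include <sys/resource.h>

/* Hypothetical wrapper: lower both the soft and hard process limits to
 * zero so that later fork()/clone() calls are expected to fail.
 * Lowering a limit needs no privilege, but the hard limit cannot be
 * raised again by an unprivileged process. Returns 0 on success. */
int forbid_new_processes(void)
{
	struct rlimit limit = { .rlim_cur = 0, .rlim_max = 0 };

	return setrlimit(RLIMIT_NPROC, &limit);
}
```

Note that on Linux this limit is typically not enforced for privileged processes, so such a call would only be effective after privileges have been dropped. A caller would treat a non-zero return as the "failed to disable forks" case, for which the commit says a warning is emitted.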
Emmanuel Hocdet	e9a100e982	BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0 Commit `d4f9a60e` "MINOR: ssl: deduplicate ca-file" uses undeclared X509 functions when built with openssl < 1.1.0. Introduce these functions in openssl-compat.h. Fix issue #385.	2019-12-03 07:13:12 +01:00
Emmanuel Hocdet	d4f9a60ee2	MINOR: ssl: deduplicate ca-file Typically, a server line like 'server-template srv 1-1000 *:443 ssl ca-file ca-certificates.crt' loads ca-certificates.crt 1000 times, and the copies stay duplicated in memory. The same applies to a bind line: the ca-file is loaded for each certificate. With this patch, a given 'ca-file' is loaded only once and stays deduplicated in memory. As a corollary, this will prevent file access for ca-file when updating a certificate via the CLI.	2019-11-28 11:11:20 +01:00
Willy Tarreau	cdb27e8295	MINOR: version: this is development again, update the status It's basically a revert of commit `9ca7f8cea`.	2019-11-25 20:38:32 +01:00
Willy Tarreau	2e077f8d53	[RELEASE] Released version 2.2-dev0 Released version 2.2-dev0 with the following main changes : - exact copy of 2.1.0	2019-11-25 20:36:16 +01:00
Willy Tarreau	9ca7f8ceac	MINOR: version: indicate that this version is stable Also indicate that it will get fixes till ~Q1 2021.	2019-11-25 19:47:23 +01:00
Willy Tarreau	c22d5dfeb8	MINOR: h2: add a function to report H2 error codes as strings Just like we have frame type to string, let's have error to string to improve debugging and traces.	2019-11-25 11:34:26 +01:00
Willy Tarreau	8f3ce06f14	MINOR: ist: add ist_find_ctl() This new function looks for the first control character in a string (a char whose value is between 0x00 and 0x1F inclusive) and returns it, or NULL if there is none. It is optimized for quickly evicting non-matching strings and scans ~0.43 bytes per cycle. It can be used as an accelerator when it's needed to look up several of these characters (e.g. CR/LF/NUL).	2019-11-25 10:33:35 +01:00
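As a point of reference for what such a lookup has to return, here is a naive byte-by-byte sketch. It is not the optimized ist_find_ctl() and does not use the ist type: `find_ctl_naive()` is a hypothetical helper that takes a pointer and a length.

```c
#include <stddef.h>

/* Hypothetical reference helper: return a pointer to the first byte in
 * the range 0x00..0x1F, or NULL if the buffer contains none. Bytes are
 * compared as unsigned so that values >= 0x80 never match. */
const char *find_ctl_naive(const char *s, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if ((unsigned char)s[i] <= 0x1F)
			return s + i;
	}
	return NULL;
}
```

A simple version like this can serve as an oracle when testing a faster implementation: both should return the same position for any input.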
Willy Tarreau	47479eb0e7	MINOR: version: emit the link to the known bugs in output of "haproxy -v" The link to the known bugs page for the current version is built and reported there. When it is a development version (less than 2 dots), instead a link to github open issues is reported as there's no way to be sure about the current situation in this case and it's better that users report their trouble there.	2019-11-21 18:48:20 +01:00
Willy Tarreau	08dd202d73	MINOR: version: report the version status in "haproxy -v" As discussed on Discourse here: https://discourse.haproxy.org/t/haproxy-branch-support-lifetime/4466 it's not always easy for end users to know the lifecycle of the version they are using. This patch introduces a "Status" line in the output of "haproxy -vv" indicating whether it's a development, stable, long-term supported version, possibly with an estimated end of life for the branch when it can be anticipated (e.g. for stable versions). This field should be adjusted when creating a major release to reflect the new status. It may make sense to backport this to other branches to clarify the situation.	2019-11-21 18:47:54 +01:00
William Lallemand	8b453912ce	MINOR: ssl: ssl_sock_prepare_ctx() return an error code Rework ssl_sock_prepare_ctx() so it fills a buffer with the error messages instead of using ha_alert()/ha_warning(). Also returns an error code (ERR_*) instead of the number of errors.	2019-11-21 17:48:11 +01:00
Daniel Corbett	f8716914c7	MEDIUM: dns: Add resolve-opts "ignore-weight" It was noted in #48 that there are times when a configuration may use the server-template directive with SRV records and simultaneously want to control weights using an agent-check or through the runtime api. This patch adds a new option "ignore-weight" to the "resolve-opts" directive. When specified, any weight indicated within an SRV record will be ignored. This is for both initial resolution and ongoing resolution.	2019-11-21 17:25:31 +01:00
Frédéric Lécaille	ec1c10b839	MINOR: peers: Add debugging information to "show peers". This patch adds three counters to help in debugging peers protocol issues to "peer" struct: ->no_hbt counts the number of reconnection periods without receiving a heartbeat ->new_conn counts the number of reconnections after ->reconnect timeout expirations. ->proto_err counts the number of protocol errors.	2019-11-19 14:48:28 +01:00
Frédéric Lécaille	33cab3c0eb	MINOR: peers: Add TX/RX heartbeat counters. Add RX/TX heartbeat counters to "peer" struct to have an idea about which peer is alive or not. Dump these counters values on the CLI via "show peers" command.	2019-11-19 14:48:25 +01:00
Cédric Dufour	0d7712dff0	MINOR: stick-table: allow sc-set-gpt0 to set value from an expression Allow the sc-set-gpt0 action to set GPT0 to a value dynamically evaluated from its <expr> argument (in addition to the existing static <int> alternative).	2019-11-15 18:24:19 +01:00
Willy Tarreau	869efd5eeb	BUG/MINOR: log: make "show startup-log" use a ring buffer instead The copy of the startup logs used to rely on a re-allocated memory area on the fly, that would attempt to be delivered at once over the CLI. But if it's too large (too many warnings) it will take time to start up, and may not even show up on the CLI as it doesn't fit in a buffer. The ring buffer infrastructure solves all this with no more code, let's switch to this instead. It simply requires a parsing function to attach the ring via ring_attach_cli() and all the rest is automatically handled. Initially this was imagined as a code cleanup, until a test with a config involving 100k backends and just one occurrence of "load-server-state-from-file global" in the defaults section took approx 20 minutes to parse due to the O(N^2) cost of concatenating the warnings resulting in ~1 TB of data to be copied, while it took only 0.57s with the ring. Ideally this patch should be backported to 2.0 and 1.9, though it relies on the ring infrastructure which will then also need to be backported. Configs able to trigger the bug are uncommon, so another workaround for older versions without backporting the rings would consist in simply limiting the size of the error message in print_message() to something always printable, which will only return the first errors.	2019-11-15 15:50:16 +01:00
Christopher Faulet	0d1c2a65e8	MINOR: stats: Report max times in addition of the averages for sessions Now, for the sessions, the maximum times (queue, connect, response, total) are reported in addition to the averages over the last 1024 connections. These values are called qtime_max, ctime_max, rtime_max and ttime_max. This patch is related to #272.	2019-11-15 14:23:54 +01:00
Christopher Faulet	efb41f0d8d	MINOR: counters: Add fields to store the max observed for {q,c,d,t}_time For backends and servers, some average times over the last 1024 connections are already calculated. For the moment, the averages for the time passed in the queue, the connect time, the response time (for HTTP sessions only) and the total time are calculated. Now, in addition, the maximum times observed for these values are also stored. These new counters are cleared like all other max values with the CLI command "clear counters". This patch is related to #272.	2019-11-15 14:23:21 +01:00
Christopher Faulet	e2e8c6779e	MINOR: freq_ctr: Make the sliding window sums thread-safe swrate_add() and swrate_add_scaled() now rely on the CAS atomic operation. So the sliding window sums are atomically updated.	2019-11-15 13:43:08 +01:00
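A CAS-based update of a sliding-window sum can be sketched with C11 atomics as follows. The function name `swrate_add_sketch()` is hypothetical, and the recurrence used (subtract the rounded-up average of the window, then add the new sample) is an assumption about the window arithmetic rather than a quote of freq_ctr's code.

```c
#include <stdatomic.h>

/* Hypothetical sketch: atomically replace *sum with
 * sum - ceil(sum / n) + v, where n is the window size and v the new
 * sample. On CAS failure old_sum is refreshed with the current value,
 * so the new sum is recomputed from up-to-date data and no concurrent
 * update is lost. Returns the value that was stored. */
unsigned int swrate_add_sketch(_Atomic unsigned int *sum, unsigned int n,
                               unsigned int v)
{
	unsigned int old_sum = atomic_load(sum);
	unsigned int new_sum;

	do {
		new_sum = old_sum - (old_sum + n - 1) / n + v;
	} while (!atomic_compare_exchange_weak(sum, &old_sum, new_sum));

	return new_sum;
}
```

With a window of 4 and a constant sample of 8, successive calls starting from zero return 8, 14, 18, and so on, with the sum converging towards n times the sample value.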
Christopher Faulet	b2e58492b1	MEDIUM: filters: Adapt filters API to allow again TCP filtering on HTX streams This change makes the payload filtering uniform between TCP and HTTP filters. Now, in TCP, like in HTTP, there is only one callback responsible for forwarding data. Thus, the old callbacks, tcp_data() and tcp_forward_data(), are replaced by a single callback function, tcp_payload(). This new callback gets the offset in the payload to (re)start the filtering and the maximum amount of data it can forward. It is the filter's responsibility to be compatible with HTX streams. If not, it must not set the flag FLT_CFG_FL_HTX. Because of this change, nxt and fwd offsets are no longer needed. Thus they are removed from the filter structure with their update functions, flt_change_next_size() and flt_change_forward_size(). Moreover, the trace filter has been updated accordingly. This patch breaks the compatibility with the old API. Thus it should probably not be backported. But, AFAIK, there is no TCP filter, thus the breakage is very limited.	2019-11-15 13:43:08 +01:00
Willy Tarreau	da52035a45	MINOR: memory: also poison the area on freeing Doing so sometimes helps detect some UAF situations without the overhead associated to the DEBUG_UAF define.	2019-11-15 07:06:46 +01:00
Olivier Houchard	7031e3dace	BUG/MEDIUM: tasks: Make tasklet_remove_from_tasklet_list() no matter the tasklet. In tasklet_remove_from_tasklet_list(), we can be called for a tasklet that is either in the private task list, or in the shared tasklet list. Take that into account and always use MT_LIST_DEL() to remove it, otherwise if we're in the shared list and another thread attempts to add a tasklet in it, bad things will happen. __tasklet_remove_from_tasklet_list() is left unchanged, it's only supposed to be used by process_runnable_task() to remove task/tasklets from the private task list. This should not be backported. This should fix github issue #357.	2019-11-09 18:27:17 +01:00
Christopher Faulet	fee726ffa7	MINOR: http-ana: Remove the unused function http_reset_txn() Since the legacy HTTP mode was removed, the stream is always released at the end of each HTTP transaction and a new one is created to handle the next request for keep-alive connections. So the HTTP transaction is no longer reset and the function http_reset_txn() can be removed.	2019-11-07 15:32:52 +01:00
Christopher Faulet	eea8fc737b	MEDIUM: stream/trace: Register a new trace source with its events Runtime traces are now supported for the streams, only if compiled with debug. process_stream() is covered as well as TCP/HTTP analyzers and filters. In traces, the first argument is always a stream. So it is easy to get the info about the channels and the stream-interfaces. The second argument, when defined, is always a HTTP transaction. And the third one is an HTTP message. The trace message is adapted to report HTTP info when possible.	2019-11-06 10:14:32 +01:00
Christopher Faulet	db703b1918	MINOR: trace: Add a set of macros to trace events if HA is compiled with debug The macros DBG_TRACE_*() can be used instead of existing trace macros to emit trace messages in debug mode only, i.e. when HAProxy is compiled with DEBUG_FULL or DEBUG_DEV. Otherwise, these macros do nothing. So it is possible to add traces for development purposes without impacting the performance of production instances.	2019-11-06 10:14:32 +01:00
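The general pattern of a trace macro that compiles to nothing outside debug builds looks like the sketch below. `DEMO_DEBUG`, `DEMO_DBG_TRACE()` and `process_request()` are hypothetical names chosen for illustration; the real DBG_TRACE_*() macros take different arguments.

```c
#include <stdio.h>

/* Hypothetical debug-only trace macro: it prints to stderr when the
 * build defines DEMO_DEBUG, and expands to an empty statement
 * otherwise, so release builds carry no trace code at all. */
#if defined(DEMO_DEBUG)
#define DEMO_DBG_TRACE(msg) do { fprintf(stderr, "trace: %s\n", (msg)); } while (0)
#else
#define DEMO_DBG_TRACE(msg) do { } while (0)
#endif

/* Example call site: the function behaves the same in both builds. */
int process_request(int value)
{
	DEMO_DBG_TRACE("entering process_request");
	return value * 2;
}
```

Because the empty variant never evaluates its argument, even expensive expressions passed to the macro cost nothing in production builds.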
William Lallemand	21724f0807	MINOR: ssl/cli: replace the default_ctx during 'commit ssl cert' If the SSL_CTX of a previous instance (ckch_inst) was used as a default_ctx, replace the default_ctx of the bind_conf by the first SSL_CTX inserted in the SNI tree. Use the RWLOCK of the sni tree to handle the change of the default_ctx.	2019-11-04 18:16:53 +01:00
Damien Claisse	ae6f125c7b	MINOR: sample: add us/ms support to date/http_date It can sometimes be useful to have a timestamp with a resolution of less than a second. It is currently painful to obtain this, because concatenating date and date_us leads to a shorter timestamp during the first 100ms of a second, which is not parseable and needs ugly ACLs in the configuration to prepend 0s when needed. To improve this, add an optional <unit> parameter to the date sample to report an integer with the desired unit. Also support this unit in the http_date converter to report a date string with sub-second precision.	2019-10-31 08:47:31 +01:00
William Lallemand	beea2a476e	CLEANUP: ssl/cli: remove leftovers of bundle/certs (it < 2) Remove the leftovers of the certificate + bundle updating in 'ssl set cert' and 'commit ssl cert'. * Remove the it variable in appctx.ctx.ssl. * Stop doing everything twice. * Indent	2019-10-30 17:52:34 +01:00
William Lallemand	bc6ca7ccaa	MINOR: ssl/cli: rework 'set ssl cert' as 'set/commit' This patch splits the 'set ssl cert' CLI command into 2 commands. The previous way of updating the certificate on the CLI was limited with the bundles. It was only able to apply one of the three parts of the certificate during an update, which means that we needed 3 updates to update a full 3 certs bundle. It was also not possible to apply atomically several parts of a certificate with the ability to rollback on error. (For example applying a .pem, then a .ocsp, then a .sctl) The command 'set ssl cert' will now duplicate the certificate (or bundle) and update it in a temporary transaction. The second command 'commit ssl cert' will commit all the changes made during the transaction for the certificate. This commit breaks the ability to update a certificate which was used as a unique file and as a bundle in the HAProxy configuration. This way of using the certificates wasn't making any sense. Example: // For a bundle: $ echo -e "set ssl cert localhost.pem.rsa <<\n$(cat kikyo.pem.rsa)\n" | socat /tmp/sock1 - Transaction created for certificate localhost.pem! $ echo -e "set ssl cert localhost.pem.dsa <<\n$(cat kikyo.pem.dsa)\n" | socat /tmp/sock1 - Transaction updated for certificate localhost.pem! $ echo -e "set ssl cert localhost.pem.ecdsa <<\n$(cat kikyo.pem.ecdsa)\n" | socat /tmp/sock1 - Transaction updated for certificate localhost.pem! $ echo "commit ssl cert localhost.pem" | socat /tmp/sock1 - Committing localhost.pem. Success!	2019-10-30 17:01:07 +01:00
William Dauchy	0fec3ab7bf	MINOR: init: always fail when setrlimit fails This patch introduces a strict-limits parameter which enforces the setrlimit setting instead of a warning. This option can be forcibly disabled with the "no" keyword. The general aim of this patch is to avoid bad surprises on a production environment where you change the maxconn for example, a new fd limit is calculated, but cannot be set because of sysfs setting. In that case you might want to have an explicit failure to be aware of it before seeing your traffic going down. During a global rollout it is also useful to explicitly fail as most progressive rollout would simply check the general health check of the process. As discussed, plan to use the strict by default mode starting from v2.3. Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2019-10-29 17:42:27 +01:00
Olivier Houchard	6e8e2ec849	BUG/MEDIUM: stream_interface: Only use SI_ST_RDY when the mux is ready. In si_connect(), only switch the stream_interface status to SI_ST_RDY if we're reusing a connection and if the connection's mux is ready. Otherwise, maybe we're reusing a connection that is not fully established yet, and may fail, and setting SI_ST_RDY would mean we would not be able to retry to connect. This should be backported to 1.9 and 2.0. This commit depends on 55234e33708c5a584fb9efea81d71ac47235d518.	2019-10-29 14:15:20 +01:00
Olivier Houchard	9b8e11e691	MINOR: mux: Add a new method to get information about a mux. Add a new method, ctl(), to muxes. It uses a "enum mux_ctl_type" to let it know which information we're asking for, and can output it either directly by returning the expected value, or by using an optional "output" argument. Right now, the only known mux_ctl_type is MUX_STATUS, that will return 0 if the mux is not ready, or MUX_STATUS_READY if the mux is ready. We probably want to backport this to 1.9 and 2.0.	2019-10-29 14:15:20 +01:00
Willy Tarreau	2254b8ef4a	Revert "MINOR: istbuf: add b_fromist() to make a buffer from an ist" This reverts commit `9e46496d45`. It was wrong and is not reliable, depending on the compiler's version and optimization, as the struct is assigned inside a statement, thus on its own stack. It's not needed anymore now so let's remove this.	2019-10-29 13:09:14 +01:00
Willy Tarreau	20020ae804	MINOR: chunk: add chunk_istcat() to concatenate an ist after a chunk We previously relied on chunk_cat(dst, b_fromist(src)) for this but it is not reliable as the allocated buffer is inside the expression and may be on a temporary stack. While it's possible to allocate stack space for a struct and return a pointer to it, it's not possible to initialize it from a temporary variable to prevent arguments from being evaluated multiple times. Since this is only used to append an ist after a chunk, let's instead have a chunk_istcat() function to perform exactly this from a native ist. The only call place (URI computation in the cache) was updated.	2019-10-29 13:09:14 +01:00
Willy Tarreau	9b013701f1	MINOR: stats/debug: maintain a counter of debug commands issued Debug commands will usually mark the fate of the process. We'd rather have them counted and visible in a core or in stats output than trying to guess how a flag combination could happen. The counter is only incremented when the command is about to be issued however, so that failed attempts are ignored.	2019-10-24 18:38:00 +02:00
Willy Tarreau	abb9f9b057	MINOR: cli: add an expert mode to hide dangerous commands Some commands like the debug ones are not enabled by default but can be useful on some production environments. In order to avoid the temptation of using them incorrectly, let's introduce an "expert" mode for a CLI connection, which allows some commands to appear and be used. It is enabled by command "expert-mode on" which is not listed by default.	2019-10-24 18:38:00 +02:00
Willy Tarreau	86bfe146c9	REORG: move CLI access level definitions to cli.h These ones were still in global.h which is misplaced.	2019-10-24 18:38:00 +02:00
William Lallemand	705e088f0a	BUG/MINOR: ssl: fix build of X509_chain_up_ref() w/ libreSSL LibreSSL brought X509_chain_up_ref() in 2.7.5, so no need to build our own version starting from this version.	2019-10-23 23:20:08 +02:00
William Lallemand	89f5807315	BUG/MINOR: ssl: fix build with openssl < 1.1.0 `8c1cddef` ("MINOR: ssl: new functions duplicate and free a ckch_store") use some OpenSSL refcount functions that were introduced in OpenSSL 1.0.2 and OpenSSL 1.1.0. Fix the problem by introducing them in openssl-compat.h. Fix #336.	2019-10-23 19:44:50 +02:00
William Lallemand	8f840d7e55	MEDIUM: cli/ssl: handle the creation of SSL_CTX in an IO handler To avoid affecting the traffic too much during a certificate update, create the SNIs in an IO handler which yields every 10 ckch instances. This way haproxy continues to respond even if we try to update a certificate which has 50 000 instances.	2019-10-23 11:54:51 +02:00
Willy Tarreau	403bfbb130	BUG/MEDIUM: pattern: make the pattern LRU cache thread-local and lockless As reported in issue #335, a lot of contention happens on the PATLRU lock when performing expensive regex lookups. This is absurd since the purpose of the LRU cache was to have a fast cache for expressions, thus the cache must not be shared between threads and must remain lockless. This commit makes the LRU cache thread-local and gets rid of the PATLRU lock. A test with 7 threads on 4 cores climbed from 67kH/s to 369kH/s, or a scalability factor of 5.5. Given the huge performance difference and the regression caused to users migrating from processes to threads, this should be backported at least to 2.0. Thanks to Brian Diekelman for his detailed report about this regression.	2019-10-23 07:27:25 +02:00
Willy Tarreau	8cdc167df8	BUG/MEDIUM: task: make tasklets either local or shared but not both at once Tasklets may be woken up to run on the calling thread or by a specific thread (the owner). But since we use a non-thread safe mechanism when the calling thread is also the owner, there may sometimes be collisions when two threads decide to wake the same tasklet up at the same time and one of them is the owner. This is more of a matter of usage than code, in that a tasklet usually is designed to be woken up and executed on the calling thread only (most cases) or on a specific thread. Thus it is a property of the tasklet itself as this solely depends how the code is constructed around it. This patch performs a small change to address this. By default tasklet_new() creates a "local" tasklet, which will run on the calling thread, like in 2.0. This is done by setting tl->tid to a negative value. If the caller wants the tasklet to run exclusively on a specific thread, it just has to set tl->tid, which is already what shared tasklet callers do anyway. No backport is needed.	2019-10-18 09:04:55 +02:00
Willy Tarreau	891b5ef05a	BUG/MEDIUM: tasklet: properly compute the sleeping threads mask in tasklet_wakeup() The use of ~(1 << tid) to compute the sleeping_mask in tasklet_wakeup() will result in breakage above 32 threads, because (1<<31) = 0xFFFFFFFF80000000, and upper values will lead to theoretically undefined results, but practically will wrap over 0x1 to 0x80000000 again and indicate wrong sleeping masks. It seems that the main visible effect may be extra latency on some threads or short CPU loops on others. No backport is needed.	2019-10-18 09:00:26 +02:00
Olivier Houchard	2068ec4f89	BUG/MEDIUM: lists: Handle 1-element-lists in MT_LIST_BEHEAD(). In MT_LIST_BEHEAD(), explicitly set the next element of the prev to NULL, instead of setting it to the prev of the next. If we only had one element, then we'd set the next and the prev to the element itself, and thus it would make the element appear to be outside any list.	2019-10-17 17:48:20 +02:00
Willy Tarreau	9e46496d45	MINOR: istbuf: add b_fromist() to make a buffer from an ist A lot of our chunk-based functions are able to work on a buffer pointer but not on an ist. Instead of duplicating all of them to also take an ist as a source, let's have a macro to make a temporary dummy buffer from an ist. This will only result in structure field manipulations that the compiler will quickly figure to eliminate them with inline functions, and in other cases it will just use 4 words in the stack before calling a function, instead of performing intermediary conversions.	2019-10-17 10:40:47 +02:00
David Carlier	a92c5cec2d	BUILD/MEDIUM: threads: rename thread_info struct to ha_thread_info On Darwin, the thread_info name exists as a standard function thus we need to rename our array to ha_thread_info to fix this conflict.	2019-10-17 07:15:17 +02:00
Christopher Faulet	065118166c	MINOR: htx: Add a flag on HTX to know when a response was generated by HAProxy The flag HTX_FL_PROXY_RESP is now set on responses generated by HAProxy, excluding responses returned by applets and services. It is an informative flag set by the applicative layer.	2019-10-16 10:03:12 +02:00
Willy Tarreau	abefa34c34	MINOR: version: make the version strings variables, not constants It currently is not possible to figure the exact haproxy version from a core file for the sole reason that the version is stored into a const string and as such ends up in the .text section that is not part of a core file. By turning them into variables we move them to the data section and they appear in core files. In order to help finding them, we just prepend an extra variable in front of them and we're able to immediately spot the version strings from a core file: $ strings core | fgrep -A2 'HAProxy version' HAProxy version follows 2.1-dev2-e0f48a-88 2019/10/15 (These are haproxy_version and haproxy_date respectively). This may be backported to 2.0 since this part is not supposed to impact anything but the developer's time spent debugging.	2019-10-16 09:56:57 +02:00
Christopher Faulet	53a899b946	CLEANUP: h1-htx: Move htx-to-h1 formatting functions from htx.c to h1_htx.c The functions "htx_*_to_h1()" have been renamed into "h1_format_htx_*()" and moved in the file h1_htx.c. It is the right place for such functions.	2019-10-14 22:28:50 +02:00
Christopher Faulet	48fa033f28	BUG/MINOR: chunk: Fix tests on the chunk size in functions copying data When raw data are copied or appended in a chunk, the result must not exceed the chunk size but it can reach it. Unlike functions to copy or append a string, there is no terminating null byte. This patch must be backported as far as 1.8. Note in 1.8, the functions chunk_cpy() and chunk_cat() don't exist.	2019-10-14 16:45:09 +02:00
William Lallemand	e0c51ae358	BUG/MINOR: ssl: fix build without SSL Commits `222a7c6` and `150bfa8` introduced some SSL initialization in bind_conf_alloc() which broke the build without SSL. Issue #322.	2019-10-14 11:24:17 +02:00
William Lallemand	246c0246d3	MINOR: ssl: load the ocsp in/from the ckch Don't try to load the files containing the issuer and the OCSP response each time we generate a SSL_CTX. The .ocsp and the .issuer are now loaded in the struct cert_key_and_chain only once and then loaded from this structure when creating a SSL_CTX.	2019-10-11 17:32:03 +02:00
William Lallemand	a17f4116d5	MINOR: ssl: load the sctl in/from the ckch Don't try to load the file containing the sctl each time we generate a SSL_CTX. The .sctl is now loaded in the struct cert_key_and_chain only once and then loaded from this structure when creating a SSL_CTX. Note that this now make possible the use of sctl with multi-cert bundles.	2019-10-11 17:32:03 +02:00
William Lallemand	150bfa84e3	MEDIUM: ssl/cli: 'set ssl cert' updates a certificate from the CLI $ echo -e "set ssl cert certificate.pem <<\n$(cat certificate2.pem)\n" | socat stdio /var/run/haproxy.stat Certificate updated! The operation is locked at the ckch level with a HA_SPINLOCK_T which prevents the ckch architecture (ckch_store, ckch_inst..) from being modified at the same time. So you can't do a certificate update at the same time from multiple CLI connections. SNI trees are also locked with a HA_RWLOCK_T so reading operations are locked only during a certificate update. Bundles are supported but you need to update each file (.rsa|ecdsa|.dsa) independently. If a file is used in the configuration as a bundle AND as a unique certificate, both will be updated. Bundles, directories and crt-list are supported, however filters in crt-list are currently unsupported. The code tries to allocate every SNIs and certificate instances first, so it can rollback the operation if that was unsuccessful. If you have too many instances of the certificate (at least 20000 in my tests on my laptop), the function can take too much time and be killed by the watchdog. This will be fixed later. Also with too many certificates it's possible that socat exits before the end of the generation without displaying a message, consider changing the socat timeout in this case (-t2 for example). The size of the certificate is currently limited by the maximum size of a payload, that must fit in a buffer.	2019-10-11 17:32:03 +02:00
William Lallemand	1d29c7438e	MEDIUM: ssl: split ssl_sock_add_cert_sni() In order to allow the creation of sni_ctx in runtime, we need to split the function to allow rollback. We need to be able to allocate all sni_ctxs required before inserting them in case we need to rollback if we didn't succeed the allocation. The function was split in 2 parts. The first one ckch_inst_add_cert_sni() allocates a struct sni_ctx, fills it with the right data and inserts it in the ckch_inst's list of sni_ctx. The second will take every sni_ctx in the ckch_inst and insert them in the bind_conf's sni tree.	2019-10-11 17:32:03 +02:00
William Lallemand	9117de9e37	MEDIUM: ssl: introduce the ckch instance structure struct ckch_inst represents an instance of a certificate (ckch_node) used in a bind_conf. Every sni_ctx created for 1 ckch_node in a bind_conf is linked in this structure. This patch allocates the ckch_inst for each bind_conf and inserts the sni_ctx in its linked list.	2019-10-11 17:32:03 +02:00
William Lallemand	222a7c6ae0	MINOR: ssl: initialize explicitly the sni_ctx trees	2019-10-11 17:32:02 +02:00
William Lallemand	f6adbe9f28	REORG: ssl: move structures to ssl_sock.h	2019-10-11 17:32:02 +02:00
Olivier Houchard	804ef244c6	MINOR: lists: Fix alignment of \ when relevant. Make sure all the \ are properly aligned in macros, this contains no functional change.	2019-10-11 16:56:25 +02:00
Olivier Houchard	74715da030	MINOR: lists: Try to use local variables instead of macro arguments. When possible, use local variables instead of using the macro arguments explicitly, otherwise they may be evaluated over and over.	2019-10-11 16:56:25 +02:00
Olivier Houchard	06910464dd	MEDIUM: task: Split the tasklet list into two lists. As using an mt_list for the tasklet list is costly, instead use a regular list, but add an mt_list for tasklet woken up by other threads, to be run on the current thread. At the beginning of process_runnable_tasks(), we just take the new list, and merge it into the task_list. This should give us performances comparable to before we started using a mt_list, but allow us to use tasklet_wakeup() from other threads.	2019-10-11 16:37:41 +02:00
Willy Tarreau	d7f2bbcbe3	MINOR: list: add new macro MT_LIST_BEHEAD This macro atomically cuts the head of a list and returns the list of elements as a detached list, meaning that they're all linked together without any head. If the list was empty, NULL is returned.	2019-10-11 16:37:41 +02:00
Willy Tarreau	c32a0e522f	MINOR: lists: add new macro LIST_SPLICE_END_DETACHED This macro adds a detached list at the end of an existing list. The detached list is a list without head, containing only elements.	2019-10-11 16:37:41 +02:00
Willy Tarreau	eaa55370c3	MINOR: stats: prepare to add a description with each stat/info field Several times some users have expressed the non-intuitive aspect of some of our stat/info metrics and suggested to add some help. This patch replaces the char* arrays with an array of name_desc so that we now have some reserved room to store a description with each stat or info field. These descriptions are currently empty and not reported yet.	2019-10-10 11:30:07 +02:00
Willy Tarreau	2f39738750	MINOR: stats: support the "desc" output format modifier for info and stat Now "show info" and "show stat" can parse "desc" as an output format modifier that will be passed down the chain to add some descriptions to the fields depending on the format in use. For now it is not exploited.	2019-10-10 11:30:07 +02:00
Willy Tarreau	ab02b3f345	MINOR: stats: get rid of the STAT_SHOWADMIN flag This flag is used to decide to show the check box in front of a proxy on the HTML stat page. It is always equal to STAT_ADMIN except when the proxy has no backend capability (i.e. a pure frontend) or has no server, in which case it's only used to avoid leaving an empty column at the beginning of the table. Not only this is pretty useless, but it also causes the columns not to align well when mixing multiple proxies with or without servers. Let's simply always use STAT_ADMIN and get rid of this flag.	2019-10-10 11:30:07 +02:00
Willy Tarreau	708c41602b	MINOR: stats: replace the ST_* uri_auth flags with STAT_* We used to rely on some config flags defined in uri_auth.h set during parsing, and another set of STAT_* flags defined in stats.h set at run time, with a somewhat gray area between the two sets. This is confusing in the stats code as both are called "flags" in various functions and it's quite hard to know which one describes what. This patch cleans this up by replacing all ST_* by a newly assigned value from the STAT_* set so that we can now use unified flags to describe both the configuration and the current state. There is no functional change at all.	2019-10-10 11:30:07 +02:00
Willy Tarreau	ee4f5f83d3	MINOR: stats: get rid of the ST_CONVDONE flag This flag was added in 1.4-rc1 by commit `329f74d463` ("[BUG] uri_auth: do not attemp to convert uri_auth -> http-request more than once") to address the case where two proxies inherit the stats settings from the defaults instance, and the first one compiles the expression while the second one uses it. In this case since they use the exact same uri_auth pointer, only the first one should compile and the second one must not fail the check. This was addressed by adding an ST_CONVDONE flag indicating that the expression conversion was completed and didn't need to be done again. But this is a hack and it becomes cumbersome in the middle of the other flags which are all relevant to the stats applet. Let's instead fix it by checking if we're dealing with an alias of the defaults instance and refrain from compiling this twice. This allows us to remove the ST_CONVDONE flag. A typical config requiring this check is : defaults mode http stats auth foo:bar listen l1 bind :8080 listen l2 bind :8181 Without this (or previous) check it would complain when checking l2's validity since the rule was already built.	2019-10-10 11:30:07 +02:00
Christopher Faulet	16fdc55f79	MINOR: http: Add a function to get the authority into a URI The function http_get_authority() may be used to parse a URI and looks for the authority, between the scheme and the path. An option may be used to skip the user info (part before the '@'). Most of time, the user info will be ignored.	2019-10-09 11:05:31 +02:00
Christopher Faulet	9a67c293b9	MINOR: htx: Add 2 flags on the start-line to have more info about the uri The first flag, HTX_SL_F_HAS_AUTHORITY, is set when the uri contains an authority. For the H1, it happens when a CONNECT request is received or when an absolute uri is used. For the H2, it happens when the pseudo header ":authority" is provided. The second one, HTX_SL_F_NORMALIZED_URI, is set when the received uri is represented as an absolute uri because of the protocol requirements. For now, it is only used for h2 requests, when the pseudo headers :authority and :scheme are found. Internally, the uri is represented as an absolute uri. This flag allows us to make the difference between an absolute uri in h1 and h2.	2019-10-09 11:05:31 +02:00
Christopher Faulet	c5a3eb4e3a	MINOR: fcgi: Add function to get the string representation of a record type This function will be used to emit traces in the FCGI multiplexer.	2019-10-04 16:12:02 +02:00
Christopher Faulet	27aa65ecfb	MINOR: htx: Adapt htx_dump() to be used from traces This function now dumps info about the HTX message into a buffer, passed as argument. In addition, it is possible to only dump meta information, without the message content.	2019-10-04 15:48:55 +02:00
Christopher Faulet	af542635f7	MINOR: h1-htx: Update h1_copy_msg_data() to ease the traces in the mux-h1 This function now uses the address of the pointer to the htx message where the copy must be performed. This way, when a zero-copy is performed, there is no need to refresh the caller's htx message. It is a bit easier to do that way, especially to add traces in the mux-h1.	2019-10-04 15:46:59 +02:00
Willy Tarreau	2aaeee34da	BUG/MEDIUM: fd: HUP is an error only when write is active William reported that since commit `6b3089856f` ("MEDIUM: fd: do not use the FD_POLL_* flags in the pollers anymore") the master's CLI often fails to access sub-processes. There are two causes to this. One is that we did report FD_POLL_ERR on an FD as soon as FD_EV_SHUT_W was seen, which is automatically inherited from POLLHUP. And since we do not store the current shutdown state of an FD we can't know if the poller reports a sudden close resulting from an error or just a byproduct of a previous shutdown(WR) followed by a read0. The current patch addresses this by only considering this when the FD was active, since a shutdown FD is not active. The second issue is that somewhere down the chain, channel data are ignored if an error is reported on a channel. This results in content truncation, but this cause was not figured yet. No backport is needed.	2019-10-01 11:52:08 +02:00
Tim Duesterhus	07626eafa2	CLEANUP: proxy: Remove `proxy_tbl_by_name` It is no longer required as of `1b8e68e89a` and is no longer used when #306 is fixed.	2019-09-30 04:11:36 +02:00
Christopher Faulet	88a0db28ae	MINOR: stats: Add the support of float fields in stats It is now possible to format stats counters as floats. But the stats applet does not use it. This patch is required by the Prometheus exporter to send the time averages in seconds. If the promex change is backported, this patch must be backported first.	2019-09-27 08:49:09 +02:00
Christopher Faulet	d72665b425	CLEANUP: http-ana: Remove the unused function http_send_name_header() Because the HTTP multiplexers are now responsible to handle the option "http-send-name-header", the function http_send_name_header() can be removed.	2019-09-27 08:48:53 +02:00
Christopher Faulet	b1bb1afa47	MINOR: spoe: Support the async mode with several threads A different engine-id is now generated for each thread. So, it is possible to enable the async mode with several threads. This patch may be backported to older versions.	2019-09-26 16:51:02 +02:00
Willy Tarreau	93acfa2263	MINOR: time: add timeofday_as_iso_us() to return instant time as ISO We often need ISO time + microseconds in traces and ring buffers, thus this function does this by calling gettimeofday() and keeping a cached value of the part representing the tv_sec value, and only rewrites the microsecond part. The cache is per-thread so it's lockless and safe to use as-is. Some tests already show that it's easy to see 3-4 events in a single microsecond, thus it's likely that the nanosecond version will have to be implemented as well. But certain comments on the net suggest that some parsers are having trouble beyond microsecond, thus for now let's stick to the microsecond only.	2019-09-26 08:13:38 +02:00
Olivier Houchard	bba1a263c5	BUG/MEDIUM: tasklets: Make sure we're waking the target thread if it sleeps. Now that we can wake tasklet for other threads, make sure that if the thread is sleeping, we wake it up, or the tasklet won't be executed until it's done sleeping. That also means that, before going to sleep, and after we put our bit in sleeping_thread_mask, we have to check that nobody added a tasklet for us, just checking for global_tasks_mask isn't enough anymore.	2019-09-24 14:58:45 +02:00
Willy Tarreau	d022e9c98b	MINOR: task: introduce a thread-local "sched" variable for local scheduler stuff The aim is to gather all scheduler information related to the current thread. It simply points to task_per_thread[tid] without having to perform the operation at each time. We save around 1.2 kB of code on performance sensitive paths and increase the request rate by almost 1%.	2019-09-24 11:23:30 +02:00
Willy Tarreau	d66d75656e	MINOR: task: split the tasklet vs task code in process_runnable_tasks() There are a number of tests there which are enforced on tasklets while they will never apply (various handlers, destroyed task or not, arguments, results, ...). Instead let's have a single TASK_IS_TASKLET() test and call the tasklet processing function directly, skipping all the rest. It now appears visible that the only unneeded code is the update to curr_task that is never used for tasklets, except for opportunistic reporting in the debug handler, which can only catch si_cs_io_cb, which in practice doesn't appear in any report so the extra cost incurred there is pointless. This change alone removes 700 bytes of code, mostly in process_runnable_tasks() and increases the performance by about 1%.	2019-09-24 11:23:30 +02:00
Willy Tarreau	2bd65a781e	OPTIM: listeners: use tasklets for the multi-queue rings Now that we can wake up a remote thread's tasklet, it's way more interesting to use a tasklet than a task in the accept queue, as it will avoid passing through all the scheduler. Just doing this increases the accept rate by about 4%, overall recovering the slight loss introduced by the tasklet change. In addition it makes sure that even a heavily loaded scheduler (e.g. many very fast checks) will not delay a connection accept.	2019-09-24 06:57:32 +02:00
Olivier Houchard	ff1e9f39b9	MEDIUM: tasklets: Make the tasklet list a struct mt_list. Change the tasklet code so that the tasklet list is now a mt_list. That means that tasklets now have an associated tid, for the thread they are expected to run on, and any thread can now call tasklet_wakeup() for that tasklet. One can change the associated tid with tasklet_set_tid().	2019-09-23 18:16:08 +02:00
Olivier Houchard	0cd6a976ff	MINOR: mt_lists: Give MT_LIST_ADD, MT_LIST_ADDQ and MT_LIST_DEL a return value. Make it so MT_LIST_ADD and MT_LIST_ADDQ return 1 if it managed to add the item, 0 (because it was already in a list) otherwise. Make it so MT_LIST_DEL returns 1 if it managed to remove the item from a list, or 0 otherwise (because it was in no list).	2019-09-23 18:16:08 +02:00
Olivier Houchard	cb22ad4f71	MINOR: mt_lists: Do nothing in MT_LIST_ADD/MT_LIST_ADDQ if already in list. Modify MT_LIST_ADD and MT_LIST_ADDQ to do nothing if the element is already in a list.	2019-09-23 18:16:08 +02:00
Olivier Houchard	9570ecf662	MEDIUM: servers: Use LIST_DEL_INIT() instead of LIST_DEL(). In srv_add_to_idle_list(), use LIST_DEL_INIT instead of just LIST_DEL. We're about to add the connection to a mt_list, and MT_LIST_ADD/MT_LIST_ADDQ will be modified to make sure we're not adding the element if it's already in a list.	2019-09-23 18:16:08 +02:00
Olivier Houchard	5e9b92cbff	MINOR: mt_lists: Add new macros. Add a few new macros to the mt_lists. MT_LIST_LOCK_ELT()/MT_LIST_UNLOCK_ELT() helps locking/unlocking an element. This should only be used if you know for sure nobody else will remove the element from the list in the meanwhile. mt_list_for_each_entry_safe() is an iterator, similar to list_for_each_entry_safe(). It takes 5 arguments, item, list_head, member are similar to those of the non-mt variant, tmpelt is a temporary pointer to a struct mt_list, while tmpelt2 is a struct mt_list itself. MT_LIST_DEL_SELF() can be used to delete an item while parsing the list with mt_list_for_each_entry_safe(). It shouldn't be used outside, and you shouldn't use MT_LIST_DEL() while using mt_list_for_each_entry_safe().	2019-09-23 18:16:08 +02:00
Olivier Houchard	859dc80f94	MEDIUM: list: Separate "locked" list from regular list. Instead of using the same type for regular linked lists and "autolocked" linked lists, use a separate type, "struct mt_list", for the autolocked one, and introduce a set of macros, similar to the LIST_* macros, with the MT_ prefix. When we use the same entry for both regular list and autolocked list, as is done for the "list" field in struct connection, we now have to explicitly cast it to struct mt_list when using MT_ macros.	2019-09-23 18:16:08 +02:00
Christopher Faulet	78fbb9f991	MEDIUM: fcgi-app: Add FCGI application and filter The FCGI application handles all the configuration parameters used to format requests sent to an application. The configuration of an application is grouped in a dedicated section (fcgi-app <name>) and referenced in a backend to be used (use-fcgi-app <name>). To be valid, a FCGI application must at least define a document root. But it is also possible to set the default index, a regex to split the script name and the path-info from the request URI, parameters to set or unset... In addition, this patch also adds a FCGI filter, responsible for all processing on a stream.	2019-09-17 10:18:54 +02:00
Christopher Faulet	63bbf284a1	MINOR: fcgi: Add code related to FCGI protocol This code is independent and is only responsible for encoding and decoding part of the FCGI protocol.	2019-09-17 10:18:54 +02:00
Christopher Faulet	4f0f88a9d0	MEDIUM: mux-h1/h1-htx: move HTX conversion of H1 messages in dedicated file To avoid code duplication in the future mux FCGI, functions parsing H1 messages and converting them into HTX have been moved into the file h1_htx.c. Some specific parts remain in the mux H1. But most of the parsing is now generic.	2019-09-17 10:18:54 +02:00
Christopher Faulet	341fac1eb2	MINOR: http: Add function to parse value of the header Status It will be used by the mux FCGI to get the status of a response.	2019-09-17 10:18:54 +02:00
Christopher Faulet	5c6fefc8eb	MINOR: log: Provide a function to emit a log for an application Application is a generic term here. It is a module which handles its own log server list, with no dependency on a proxy. Such applications can now call the function app_log() to log messages, passing a log server list and a tag as parameters. Internally, the function __send_log() has been adapted accordingly.	2019-09-17 10:18:54 +02:00
Christopher Faulet	130cf21709	MINOR: istbuf: Add the function b_isteqi() This function compares a part of a buffer to an indirect string (ist), ignoring the case of the characters.	2019-09-17 10:18:54 +02:00
Christopher Faulet	c16929658f	MINOR: config: Support per-proxy and per-server post-check function callbacks Most of the time, when a keyword is added in proxy section or on the server line, we need to have a post-parser callback to check the config validity for the proxy or the server which uses this keyword. It is possible to register a global post-parser callback. But all these callbacks need to loop on the proxies and servers to do their job. It is neither handy nor efficient. Instead, it is now possible to register per-proxy and per-server post-check callbacks.	2019-09-17 10:18:54 +02:00
Christopher Faulet	3ea5cbe6a4	MINOR: config: Support per-proxy and per-server deinit function callbacks Most of the time, when any allocation is done during configuration parsing because of a new keyword in proxy section or on the server line, we must add a call in the deinit() function to release allocated resources. It is now possible to register a post-deinit callback because, at this stage, the proxies and the servers are already released. Now, it is possible to register deinit callbacks per-proxy or per-server. These callbacks will be called for each proxy and server before releasing them.	2019-09-17 10:18:54 +02:00
Christopher Faulet	e3d2a877fb	MINOR: http-ana: Remove err_state field from http_msg This field is not used anymore. In addition, the state HTTP_MSG_ERROR is now only used when an error occurred during the body forward.	2019-09-17 10:18:54 +02:00
Christopher Faulet	505adfca51	MINOR: htx: Add a flag on HTX message to report processing errors This new flag may be used to report an unexpected error because of badly formatted HTX messages (not related to a parsing error) or our incapacity to handle the processing because we reach a limit (resource exhaustion, too big headers...). It should result in an error 500 returned to the client when applicable.	2019-09-17 10:18:54 +02:00
Christopher Faulet	6338a08c34	MINOR: stats: Add JSON export from the stats page It is now possible to export stats using the JSON format from the HTTP stats page. Like for the CSV export, to export stats in JSON, you must add the option ";json" on the stats URL. It is also possible to dump the JSON schema with the option ";json-schema". Corresponding links have been added on the HTML page. This patch fixes the issue #263.	2019-09-10 10:29:54 +02:00
Willy Tarreau	f21d17bbe8	MINOR: stats: report the number of idle connections for each server This adds two extra fields to the stats, one for the current number of idle connections and one for the configured limit. A tooltip link now appears on the HTML page to show these values in front of the active connection values. This should be backported to 2.0 and 1.9 as it's the only way to monitor the idle connections behaviour.	2019-09-08 09:30:50 +02:00
Willy Tarreau	4cae3bf631	BUG/MEDIUM: connection: don't keep more idle connections than ever needed When using "http-reuse safe", which is the default, a new incoming connection does not automatically reuse an existing connection for the first request, as we don't want to risk losing the contents if we know the client will not be able to replay the request. A side effect to this is that when dealing with mostly http-close traffic, the reuse rate is extremely low and we keep accumulating server-side connections that may even never be reused. At some point we're limited to a ratio of file descriptors, but when the system is configured with very high FD limits, we can still reach the limit of outgoing source ports and make the system significantly slow down trying to find an available port for outgoing connections. A simple test on my laptop with ulimit 100000 and with the following config results in the load immediately dropping after a few seconds : listen l1 bind :4445 mode http server s1 127.0.0.1:8000 As can be seen, the load falls from 38k cps to 400 cps during the first 200ms (in fact when the source port table is full and connect() takes ages to find a spare port for a new connection): $ injectl464 -p 4 -o 1 -u 10 -G 127.0.0.1:4445/ -F -c -w 100 hits ^hits hits/s ^h/s bytes kB/s last errs tout htime sdht ptime 2439 2439 39338 39338 356094 5743 5743 0 0 0.4 0.5 0.4 7637 5198 38185 37666 1115002 5575 5499 0 0 0.7 0.5 0.7 7719 82 25730 820 1127002 3756 120 0 0 21.8 18.8 21.8 7797 78 19492 780 1138446 2846 114 0 0 61.4 2.5 61.4 7877 80 15754 800 1150182 2300 117 0 0 58.6 0.5 58.6 7920 43 13200 430 1156488 1927 63 0 0 58.9 0.3 58.9 At this point, lots of connections are indeed in use, for only 10 connections on the frontend side: $ ss -ant state established | wc -l 39022 This patch makes sure we never keep more idle connections than we've ever had outstanding requests on a server.
This way the total number of idle connections will never exceed the sum of maximum connections. Thus highly loaded servers will be able to get many connections and slightly loaded servers will keep less. Ideally we should apply similar limits per process and per backend, but in practice this already addresses the issues pretty well: $ injectl464 -p 4 -o 1 -u 10 -G 127.0.0.1:4445/ -F -c -w 100 hits ^hits hits/s ^h/s bytes kB/s last errs tout htime sdht ptime 4423 4423 40209 40209 645758 5870 5870 0 0 0.2 0.4 0.2 8020 3597 40100 39966 1170920 5854 5835 0 0 0.2 0.4 0.2 12037 4017 40123 40170 1757402 5858 5864 0 0 0.2 0.4 0.2 16069 4032 40172 40320 2346074 5865 5886 0 0 0.2 0.4 0.2 20047 3978 40013 39386 2926862 5842 5750 0 0 0.3 0.4 0.3 24005 3958 40008 39979 3504730 5841 5837 0 0 0.2 0.4 0.2 $ ss -ant state established | wc -l 234 This patch must be backported to 2.0. It could be useful in 1.9 as well even though pools and reuse are not enabled by default there.	2019-09-08 09:30:50 +02:00
Willy Tarreau	6b3089856f	MEDIUM: fd: do not use the FD_POLL_* flags in the pollers anymore As mentioned in previous commit, these flags do not map well to modern poller capabilities. Let's use the FD_EV_*_{R,W} flags instead. This first patch only performs a 1-to-1 mapping making sure that the previously reported flags are still reported identically while using the closest possible semantics in the pollers. It's worth noting that kqueue will now support improvements such as returning distinctions between shut and errors on each direction, though this is not exploited for now.	2019-09-06 19:09:56 +02:00
Willy Tarreau	77abb43ed1	MINOR: fd: add two flags ERR and SHUT to describe FD states There's currently a big ambiguity on our use of POLLHUP because we currently map POLLHUP and POLLRDHUP to FD_POLL_HUP. The first one indicates a close in both directions while the second one indicates a unidirectional close. Since we don't know from the resulting flag we always have to read when reported. Furthermore kqueue only reports unidirectional responses which are mapped to FD_POLL_HUP as well, and their write closes are mapped to a general error. We could add a new FD_POLL_RDHUP flag to improve the mapping, or switch only to the POLL* flags, but that further complicates the portability for operating systems like FreeBSD which do not have POLLRDHUP but have its semantics. Let's instead directly use the per-direction flag values we already have, and it will be a first step in the direction of finer states. Thus we introduce an ERR and a SHUT status for each direction, that the pollers will be able to compute and pass to fd_update_events(). It's worth noting that FD_EV_STATUS already sees the two new flags, but they are harmless since used only by fd_{recv,send}_state() which are never called. Thus in its current state this patch must be totally transparent.	2019-09-06 18:33:07 +02:00
Willy Tarreau	8f2825f3ab	MINOR: fd: add two new calls fd_cond_{recv,send}() These two functions are used to enable recv/send but only if the FD is not marked as active yet. The purpose is to conditionally mark them as tentatively usable without interfering with the polling if polling was already enabled, when it's supposed to be likely true.	2019-09-06 17:50:36 +02:00
Willy Tarreau	4ac9d064d2	MEDIUM: fd: mark the FD as ready when it's inserted Given that all our I/Os are now directed from top to bottom and not the opposite way around, and the FD cache was removed, it doesn't make sense anymore to create FDs that are marked not ready since this would prevent the first accesses unless the caller explicitly does an fd_may_recv() which is not expected to be its job (which conn_ctrl_init() has to do by the way). Let's move this into fd_insert() instead, and have a single atomic operation for both directions via fd_may_both().	2019-09-06 17:50:36 +02:00
Willy Tarreau	dbe3060e81	MINOR: fd: make updt_fd_polling() a normal function It's called from many places, better use a real function than an inline.	2019-09-05 09:31:18 +02:00
Willy Tarreau	f8ecc7f667	MEDIUM: fd: simplify the fd_*_{recv,send} functions using BTS/BTR Now that we don't have to update FD_EV_POLLED_* at the same time as FD_EV_ACTIVE_*, we don't need to use a CAS anymore, a bit-test-and-set operation is enough. Doing so reduces the code size by a bit more than 1 kB. One function was special, fd_done_recv(), whose comments and doc were inaccurate for the part related to the lack of polling.	2019-09-05 09:31:18 +02:00
Willy Tarreau	5bee3e2f47	MEDIUM: fd: remove the FD_EV_POLLED status bit Since commit `7ac0e35f2` in 1.9-dev1 ("MAJOR: fd: compute the new fd polling state out of the fd lock") we've started to update the FD POLLED bit a bit more aggressively. Lately with the removal of the FD cache, this bit is always equal to the ACTIVE bit. There's no point continuing to watch it and update it anymore, all it does is create confusion and complicate the code. One interesting side effect is that it now becomes visible that all fd_*_{send,recv}() operations systematically call updt_fd_polling(), except fd_cant_recv()/fd_cant_send() which never saw it change.	2019-09-05 09:31:18 +02:00
Willy Tarreau	c046d167e4	MEDIUM: log: add support for logging to a ring buffer Now by prefixing a log server with "ring@<name>" it's possible to send the logs to a ring buffer. One nice thing is that it allows multiple sessions to consult the logs in real time in parallel over the CLI, and without requiring file system access. At the moment, ring0 is created as a default sink for tracing purposes and is available. No option is provided to create new rings though this is trivial to add to the global section.	2019-08-30 15:24:59 +02:00
Willy Tarreau	f3dc30f6de	MINOR: log: add a target type instead of hacking the address family Instead of detecting an AF_UNSPEC address family for a log server and to deduce a file descriptor, let's create a target type field and explicitly mention that the socket is of type FD.	2019-08-30 15:07:25 +02:00
Willy Tarreau	d660990cee	MINOR: fd: add a new "initialized" bit in the fdtab struct The purpose is to be able to remember that initialization was already done for a file descriptor. This will allow to get rid of some dirty hacks performed in the logs or fd sinks where the init state of the fd has to be guessed.	2019-08-30 15:07:25 +02:00
Willy Tarreau	76913d3ef4	CLEANUP: fd: remove leftovers of the fdcache The "cache" entry was still present in the fdtab struct and it was reported in "show sess". Removing it broke the cache-line alignment on 64-bit machines which is important for threads, so it was fixed by adding an attribute(aligned()) when threads are in use. Doing it only in this case allows 32-bit thread-less platforms to see the struct fit into 32 bytes.	2019-08-30 15:07:25 +02:00
Willy Tarreau	1d181e489c	MEDIUM: ring: implement a wait mode for watchers Now it is possible for a reader to subscribe and wait for new events sent to a ring buffer. When new events are written to a ring buffer, the applets that are subscribed are woken up to display new events. For now we only support this with the CLI applet called by "show events" since the I/O handler is indeed a CLI I/O handler. But it's not complicated to add other mechanisms to consume events and forward them to external log servers for example. The wait mode is enabled by adding "-w" after "show events <sink>". An extra "-n" was added to directly seek to new events only.	2019-08-30 11:58:58 +02:00
Willy Tarreau	300decc8d9	MINOR: cli: extend the CLI context with a list and two offsets Some CLI parsers are currently abusing the CLI context types such as pointers to stuff longs into them by lack of room. But the context is 80 bytes while cli is only 48, thus there's some room left. This patch adds a list element and two size_t usable as various offsets. The list element is initialized.	2019-08-30 11:58:58 +02:00
Willy Tarreau	370a694879	MINOR: trace: change the detail_level to per-source verbosity The detail level initially based on syslog levels is not used, while something related is missing, trace verbosity, to indicate whether or not we want to call the decoding callback and what level of decoding we want (raw captures etc). Let's change the field to "verbosity" for this. A verbosity of zero means that the decoding callback is not called, and all other levels are handled by this callback and are source-specific. The source is now prompted to list the levels that are proposed to the user. When the source doesn't define anything, "quiet" and "default" are available.	2019-08-29 17:11:25 +02:00
Willy Tarreau	09fb0df6fd	MINOR: trace: prepend the function name for developer level traces Working on adding traces to mux-h2 revealed that the function names are manually copied a lot in developer traces. The reason is that they are not preprocessor macros and as such cannot be concatenated. Let's slightly adjust the trace() function call to take a function name just after the file:line argument. This argument is only added for the TRACE_DEVEL and 3 new TRACE_ENTER, TRACE_LEAVE, and TRACE_POINT macros and left NULL for others. This way the function name is only reported for traces aimed at the developers. The pretty-print callback was also extended to benefit from this. This will also significantly shrink the data segment as the "entering" and "leaving" strings will now be merged. One technical point worth mentioning is that the function name is not passed as an ist to the inline function because it's not considered as a builtin constant by the compiler, and would lead to strlen() being run on it from all call places before calling the inline function. Thus instead we pass the const char * (that the compiler knows where to find) and it's the __trace() function that converts it to an ist for internal consumption and for the pretty-print callback. Doing this avoids losing 5-10% peak performance.	2019-08-29 17:09:13 +02:00
Willy Tarreau	2ea549bc43	MINOR: trace: change the "payload" level to "data" and move it The "payload" trace level was ambiguous because its initial purpose was to be able to dump received data. But it doesn't make sense to force to report data transfers just to be able to report state changes. For example, all snd_buf()/rcv_buf() operations coming from the application layer should be tagged at this level. So here we move this payload level above the state transitions and rename it to avoid the ambiguity making one think it's only about request/response payload. Now it clearly is about any data transfer and is thus just below the developer level. The help messages on the CLI and the doc were slightly reworded to help remove this ambiguity.	2019-08-29 10:46:11 +02:00
Willy Tarreau	be5a288424	MINOR: trace: replace struct trace_lockon_args with struct name_desc No need for a specific struct anymore, name_desc suits us.	2019-08-29 09:34:53 +02:00
Willy Tarreau	fb4ba91ac1	MINOR: tools: add a generic struct "name_desc" for name-description pairs In prompts on the CLI we now commonly need to propose a keyword name and a description and it doesn't make sense to define a new struct for each such pairs. Let's simply have a generic "name_desc" for this.	2019-08-29 09:34:53 +02:00
Geoff Simmons	7185b789f9	MINOR: connection: add the fc_pp_authority fetch -- authority TLV, from PROXYv2 Save the authority TLV in a PROXYv2 header from the client connection, if present, and make it available as fc_pp_authority. The fetch can be used, for example, to set the SNI for a backend TLS connection.	2019-08-28 17:16:20 +02:00
Willy Tarreau	c326ecc9b1	MINOR: trace: change the TRACE() calling convention to put the args and cb last Previously the callback was almost mandatory so it made sense to have it before the message. Now that it can default to the one declared in the trace source, most TRACE() calls contain series of empty args and callbacks, which make them suitable for being at the end and being totally omitted. This patch thus reverses the TRACE arguments so that the message appears first, then the mask, then arg1..arg4, then the callback. In practice we'll mostly see 1 arg, or 2 args and nothing else, and it will not be needed anymore to pass long series of commas in the middle of the arguments. However if a source is enforced, the empty commas will still be needed for all omitted arguments.	2019-08-28 10:39:43 +02:00
Willy Tarreau	3da0026d25	MINOR: trace: support a default callback for the source It becomes apparent that most traces will use a single trace pretty print callback, so let's allow the trace source to declare a default one so that it can be omitted from trace calls, and will be used if no other one is specified.	2019-08-28 07:06:23 +02:00
Willy Tarreau	8f24023ba0	MINOR: sink: now report the number of dropped events on output The principle is that when emitting a message, if some dropped events were logged, we first attempt to report this counter before going further. This is done under an exclusive lock while all logs are produced under a shared lock. This ensures that the dropped line is accurately reported and doesn't accidentally arrive after a later event.	2019-08-27 17:14:19 +02:00
Willy Tarreau	4ed23ca0e7	MINOR: sink: add support for ring buffers This now provides sink_new_buf() which allocates a ring buffer. One such ring ("buf0") of 1 MB is created already, and may be used by sink_write(). The sink's creation should probably be moved somewhere else later.	2019-08-27 17:14:19 +02:00
Willy Tarreau	072931cdcb	MINOR: ring: add a generic CLI io_handler to dump a ring buffer The three functions (attach, IO handler, and release) are meant to be called by any CLI command which requires to dump the contents of a ring buffer. We do not implement anything generic to dump any ring buffer on the CLI since it's meant to be used by other functionalities above. However these functions deal with locking and everything so it's trivial to embed them in other code.	2019-08-27 17:14:19 +02:00
Willy Tarreau	be97853c2f	MINOR: ring: add a ring_write() function This function tries to write to the ring buffer, possibly removing enough old messages to make room for the new one. It takes two arrays of fragments on input to ease the insertion of prefixes by the caller. It atomically writes the message, possibly truncating it if desired, and returns the operation's status.	2019-08-27 17:14:19 +02:00
Willy Tarreau	172945fbad	MINOR: ring: add a new mechanism for retrieving/storing ring data in buffers Our circular buffers are well suited for being used as ring buffers for not-so-structured data. The mechanism here consists in making room in a buffer before inserting a new record which is prefixed by its size, and looking up next record based on the previous one's offset and size. We can have up to 255 consumers watching for data (dump in progress, tail) which guarantee that entries are not recycled while they're being dumped. The complete representation is described in the header file. For now only ring_new(), ring_resize() and ring_free() are created.	2019-08-27 17:14:19 +02:00
Willy Tarreau	931d8b79a8	MINOR: fd: add fd_write_frag_line() to send a fragmented line to an fd Currently both logs and event sinks may use a file descriptor to atomically emit some output contents. The two may use the same FD though nothing is done to make sure they use the same lock. Also there is quite some redundancy between the two. Better make a specific function to send a fragmented message to a file descriptor which will take care of the locking via the fd's lock. The function is also able to truncate a message and to enforce addition of a trailing LF when building the output message.	2019-08-27 17:14:19 +02:00
Willy Tarreau	b88d231773	MINOR: buffer: add functions to read/write varints from/to buffers The new functions are: __b_put_varint(), which inserts a varint when it's known that it fits; b_put_varint(), which tries to insert a varint at the tail; b_get_varint(), which tries to get a varint from the head; and b_peek_varint(), which tries to peek a varint at a specific offset. Wrapping is supported so that they are expected to be safe to use to manipulate varints with buffers anywhere.	2019-08-27 17:14:19 +02:00
Willy Tarreau	4d589e719b	MINOR: tools: add a function varint_bytes() to report the size of a varint It will sometimes be useful to encode varints to know the output size in advance. Two versions are provided, one inline using a switch/case construct which will be trivial for use with constants (and will be very fast albeit huge) and one function iterating on the number which is 5 times smaller, for use with variables.	2019-08-27 17:14:19 +02:00
Willy Tarreau	e40f274878	BUILD: trace: make the lockon_ptr const to silence a warning without threads I forgot to fix this one before pushing, despite my tests. lockon_ptr is only used to compare pointers, it doesn't need to point to a writable location. Without threads the atomic store is turned into an assignment and rightfully complains.	2019-08-22 20:26:28 +02:00
Willy Tarreau	c14eea49e6	MINOR: trace: add the possibility to lock on some arguments Given that we can pass typed arguments to the trace() function, let's add provisions for tracking them. They are source-specific so we need to let the source fill their name and description. Only those with a non-null name will be proposed.	2019-08-22 20:21:00 +02:00
Willy Tarreau	17a51c64b5	MINOR: trace: add a definition of typed arguments to trace() With a few macros it's possible for a trace source to commit to only using a certain type for a given argument (or set of). This will be particularly useful to let the trace subsystem retrieve some precious information such as a connection, session, listener, source address or so, and enable/disable filtering and/or locking.	2019-08-22 20:21:00 +02:00
Willy Tarreau	4ab242136d	MINOR: trace: add per-level macros to produce traces The new TRACE_<level>() macros take a mask, 4 args, a callback and a static message. From this they also inherit the TRACE_SOURCE macro from the caller, which contains the pointer to the trace source (so that it's not required to paste it everywhere), and an ist string is also made by the concatenation of the file name and the line number. This uses string concatenation by the preprocessor, and turns it into an ist by the compiler so that there is no operation at all to perform to adjust the data length as the compiler knows where to cut during the optimization phase. Last, the message is also automatically turned into an ist so that it's trivial to put it into an iovec without having to run strlen() on it. All arguments and the callback may be empty and will then automatically be replaced with a NULL pointer. This makes the TRACE calls slightly lighter especially since arguments are not always used. Several other options were considered to use variadic macros but there's no outstanding rule that justifies to place an argument before another one, and it still looks convenient to have the message be the last one to encourage copy-pasting of the trace statements. A generic TRACE() macro takes TRACE_LEVEL in from the source file as the trace level instead of taking it from its name. This may slightly simplify the production of traces that always run at the same level (internal core parts may probably only be called at developer level).	2019-08-22 20:21:00 +02:00
Willy Tarreau	bfd14fc6eb	MINOR: trace: implement a call to a decode function The trace() call will support an optional decoding callback and 4 arguments that this function is supposed to know how to use to provide extra information. The output remains unchanged when the function is NULL. Otherwise, the message is pre-filled into the thread-local trace_buf, and the function is called with all arguments so that it completes the buffer in a readable form depending on the expected level of detail.	2019-08-22 20:21:00 +02:00
Willy Tarreau	5da408818b	MINOR: trace: make trace() now also take a level in argument This new "level" argument will allow the trace sources to label the traces for different purposes, and filter out some of them if they are not relevant to the current target. Right now we have 5 different levels: - USER : the least verbose one, only a few functional information - PAYLOAD: like user but also displays some payload-related information - PROTO: focuses on the protocol's framing - STATE: also indicate state internal transitions or non-transitions - DEVELOPER: adds extra info about branches taken in the code (break points, return points)	2019-08-22 20:21:00 +02:00
Willy Tarreau	419bd49f0b	MINOR: trace: add the file name and line number in the prefix We now pass an extra argument "where" to the trace() call, which is supposed to be an ist made of the concatenation of the filename and the line number. We only keep the last 10 chars from this string since the end of file names is most often easy to recognize. This gives developers useful information at very low cost.	2019-08-22 20:21:00 +02:00
Willy Tarreau	4c2ae48375	MINOR: trace: implement a very basic trace() function For now it remains quite basic. It performs a few state checks, calls the source's sink if defined, and performs the transitions between RUNNING, STOPPED and WAITING when the configured events match.	2019-08-22 20:21:00 +02:00
Willy Tarreau	864e880f6c	MINOR: trace/cli: register the "trace" CLI keyword to list the sources For now it lists the sources if one is not provided, and checks for the source's existence. It lists the events if not provided, checks for their existence if provided, and adjusts reported events/start/stop/pause events, and performs state transitions. It lists sinks and adjusts them as well. Filters, lock, and level are not implemented yet.	2019-08-22 20:21:00 +02:00
Willy Tarreau	88ebd4050e	MINOR: trace: add allocation of buffer-sized trace buffers This will be needed so that we can implement protocol decoders which will have to emit their contents into such a buffer.	2019-08-22 20:21:00 +02:00
Willy Tarreau	4151c753fc	MINOR: trace: start to create a new trace subsystem The principle of this subsystem will be to support taking live traces at various places in the code with conditional triggers, filters, and ability to lock on some elements. The traces will support typed events and will be sent into sinks made of ring buffers, file descriptors or remote servers.	2019-08-22 20:21:00 +02:00
Willy Tarreau	973e662fe8	MINOR: sink: add a support for file descriptors This is the most basic type of sink. It pre-registers "stdout" and "stderr", and is able to use writev() on them. The writev() operation is locked to avoid mixing outputs. It's likely that the registration should move somewhere else to take into account the fact that stdout and stderr are still opened or are closed.	2019-08-22 20:21:00 +02:00
Willy Tarreau	67b5a161b4	MINOR: sink: create definitions and a minimal code for event sinks The principle will be to be able to dispatch events to various destinations called "sinks". This is already done in part in logs where log servers can be either a UDP socket or a file descriptor. This will be needed with the new trace subsystem where we may also want to add ring buffers. And it turns out that all such destinations make sense at all places. Logs may need to be sent to a TCP server via a ring buffer, or consulted from the CLI. Trace events may need to be sent to stdout/stderr as well as to remote log servers. This patch creates a new structure "sink" aiming at addressing these similar needs. The goal is to merge together what is common to all of them, such as the output format, the dropped events count, etc, and also keep separately the target identification (network address, file descriptor). Provisions were made to have a "waiter" on the sink. For a TCP log server it will be the task to wake up after writing to the log buffer. For a ring buffer, it could be the list of watchers on the CLI running a "tail" operation and waiting for new events. A lock was also placed in the struct since many operations will require some locking, including the FD ones. The output formats cover those in use by logs and two extra ones prepending the ISO time in front of the message (convenient for stdio/buffer). For now only the generic infrastructure is present, no type-specific output is implemented. There's the sink_write() function which prepares and formats a message to be sent, trying hard to avoid copies and only using pointer manipulation, where the type-specific code just has to be added. Dropped messages are already counted (for now 100% drop). The message is put into an iovec array as it will be trivial to use with file descriptors and sockets.	2019-08-22 20:21:00 +02:00
Willy Tarreau	9eebd8a978	REORG: trace: rename trace.c to calltrace.c and mention it's not thread-safe The function call tracing code is quite old and was never ported to support threads. It's not even sure whether it still works well, but at least its presence creates confusion for future work so let's rename it to calltrace.c and add a comment about its lack of thread-safety.	2019-08-22 20:21:00 +02:00
Willy Tarreau	32c24552e4	MINOR: tools: add a DEFNULL() macro to use NULL for empty args It's sometimes convenient for debugging macros not to be forced to explicitly pass NULL in an unused argument. This macro does this, it replaces a missing arg with NULL.	2019-08-22 20:21:00 +02:00
Willy Tarreau	9bead8c7f5	MINOR: list: add LIST_SPLICE() to merge one list into another This will move the contents of list <old> at the beginning of list <new>.	2019-08-22 20:21:00 +02:00
Willy Tarreau	60409db0b1	MINOR: lua: export applet and task handlers The current functions are seen outside from the debugging code and are convenient to export so that we can improve the thread dump output : void hlua_applet_tcp_fct(struct appctx ctx); void hlua_applet_http_fct(struct appctx ctx); struct task hlua_process_task(struct task task, void *context, unsigned short state); Of course they are only available when USE_LUA is defined.	2019-08-21 14:32:09 +02:00
Willy Tarreau	a2c9911ace	MINOR: tools: add append_prefixed_str() This is somewhat related to indent_msg() except that this one places a known prefix at the beginning of each line, allows to replace the EOL character, and not to insert a prefix on the first line if not desired. It works with a normal output buffer/chunk so it doesn't need to allocate anything nor to modify the input string. It is suitable for use in multi- line backtraces.	2019-08-21 14:32:09 +02:00
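A minimal sketch of the per-line prefixing behaviour described for append_prefixed_str() is shown below. The function name `prefix_lines_sketch`, its signature and its return convention are hypothetical and do not reflect the real function, which works on an output buffer/chunk; a plain character array is assumed here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies <in> into <out> (capacity <size>), inserting
 * <pfx> at the beginning of each line and replacing every '\n' with <eol>.
 * When <first> is zero, the first line gets no prefix. The input string is
 * never modified and nothing is allocated. Returns the number of bytes
 * written (excluding the trailing zero), or -1 if <out> is too small.
 */
static int prefix_lines_sketch(char *out, size_t size, const char *in,
                               const char *pfx, char eol, int first)
{
    size_t plen = strlen(pfx);
    size_t len = 0;
    int need_pfx = first;

    if (size == 0)
        return -1;

    for (; *in; in++) {
        if (need_pfx) {
            if (len + plen >= size)
                return -1;
            memcpy(out + len, pfx, plen);
            len += plen;
            need_pfx = 0;
        }
        if (len + 1 >= size)
            return -1;
        if (*in == '\n') {
            out[len++] = eol;
            need_pfx = 1; /* the next character starts a new line */
        } else {
            out[len++] = *in;
        }
    }
    out[len] = '\0';
    return (int)len;
}
```

The prefix is emitted lazily, only when a line actually has content, so a trailing end-of-line does not produce a dangling prefix.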
Willy Tarreau	f5cab82025	MINOR: fd: make sure to mark the thread as not stuck in fd_update_events() When I/O events are being processed, we want to make sure to mark the thread as not stuck. The reason is that some pollers (like poll()) which do not limit the number of FDs they report could possibly report a huge amount of FD all having to perform moderately expensive operations in the I/O callback (e.g. via mux-pt which forwards to the upper layers), making the watchdog think the thread is stuck since it does not schedule. Of course this must never happen but if it ever does we must be liberal about it. This should be backported to 2.0, where the situation may happen more easily due to the FD cache which can start to collect a large amount of events. It may be related to the report in issue #201 though nothing is certain about it.	2019-08-16 16:06:14 +02:00
Willy Tarreau	edb91ad647	MINOR: cli: add cli_msg(), cli_err(), cli_dynmsg(), cli_dynerr() These functions perform all the boring filling of the appctx's cli struct needed by CLI parsers to return a message or an error, and they return 1 so that they can be used as a single-line return statement. They may be used for const messages or dynamic messages.	2019-08-09 10:11:38 +02:00
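The "return 1 so that they can be used as a single-line return statement" idea can be illustrated with the sketch below. Everything in it is a stand-in: `struct cli_ctx_sketch`, the `ST_*` states, the severity value for errors and the `parse_demo_sketch` parser are hypothetical and only mirror the shape of the helpers described above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the CLI output states. */
enum { ST_IDLE, ST_PRINT, ST_PRINT_ERR };

/* Hypothetical, simplified CLI context. */
struct cli_ctx_sketch {
    int state;       /* which output state the CLI should enter */
    int severity;    /* syslog-like severity of the message */
    const char *msg; /* constant message to print */
};

/* Fills the context for a constant message and returns 1. */
static int cli_msg_sketch(struct cli_ctx_sketch *c, int severity, const char *msg)
{
    c->severity = severity;
    c->msg = msg;
    c->state = ST_PRINT;
    return 1;
}

/* Fills the context for a constant error message and returns 1. */
static int cli_err_sketch(struct cli_ctx_sketch *c, const char *msg)
{
    c->severity = 3; /* assumed error severity for this sketch */
    c->msg = msg;
    c->state = ST_PRINT_ERR;
    return 1;
}

/* A parser can then report its outcome in a single return statement. */
static int parse_demo_sketch(struct cli_ctx_sketch *c, const char *arg)
{
    if (!arg || !*arg)
        return cli_err_sketch(c, "Missing argument.\n");
    return cli_msg_sketch(c, 6, "Done.\n");
}
```

Returning a constant from the helper removes the repeated "fill three fields, then return 1" boilerplate from every parser.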
Willy Tarreau	d50c7feaa1	MINOR: cli: add two new states to print messages on the CLI Right now we have extremely inconsistent states to report output, one is CLI_ST_PRINT which prints constant message cli->msg with the assigned severity, and CLI_ST_PRINT_FREE which prints dynamically allocated cli->err with severity LOG_ERR, and nothing in between, even though it's useful to be able to report dynamically allocated messages as well as constant error messages. This patch adds two extra states, which are not particularly well named given the constraints imposed by existing ones. One is CLI_ST_PRINT_ERR which prints a constant error message. The other one is CLI_ST_PRINT_DYN which prints a dynamically allocated message. By doing so we maintain the compatibility with current code. It is important to keep in mind that we cannot pre-initialize pointers and automatically detect what message type it is based on the assigned fields, because the CLI's context is in a union shared with all other users, thus unused fields contain anything upon return. This is why we have no choice but using 4 states. Keeping the two fields <msg> and <err> remains useful because one is const and not the other one, and this catches many copy-paste mistakes. It's just that <err> is pretty confusing here, it should be renamed.	2019-08-09 10:11:38 +02:00
Willy Tarreau	247a8b1d81	CLEANUP: task: move the cpu_time field to the task-only part The CPU time accounting field called "cpu_time" is used only by tasks and not tasklets, yet it used to be stored into the TASK_COMMON part, which doesn't make sense and wastes tasklet memory. In addition, moving it to tasks also helps better group the various parts in cache lines.	2019-08-08 10:11:05 +02:00
Willy Tarreau	e0d0b4089d	CLEANUP: buffer: replace b_drop() with b_free() Since last commit there's no point anymore in having two variants of the same function, let's switch to b_free() only. __b_drop() was renamed to __b_free() for obvious consistency reasons.	2019-08-08 08:07:45 +02:00
Willy Tarreau	3b091f80aa	BUG/MINOR: buffers/threads: always clear a buffer's head before releasing it A small race exists in buffers with "show sess all". This one wants to show some information grabbed from the buffer (especially in HTX mode). But the thread owning this buffer might just be releasing its area, right after a free() or munmap() call, resulting in a head that is not seen as empty yet though the area was released. It may then be dereferenced by "show sess all" causing a crash. Note that in practice it only happens in debug mode with UAF enabled, but it's tricky enough to fix it right now. This should be backported to stable versions which support threads and a store barrier. It's worth noting that by performing the clearing first, b_free() and b_drop() now become exactly equivalent.	2019-08-08 08:07:45 +02:00
Willy Tarreau	229e739c21	BUG/MINOR: pools: don't mark the thread harmless if already isolated Commit `85b2cae63` ("MINOR: pools: make the thread harmless during the mmap/munmap syscalls") was used to relax the pressure experienced by other threads when running in debug mode with UAF enabled. It places a pair of thread_harmless_now()/thread_harmless_end() around the call to mmap(), assuming callers are not sensitive to parallel activity. But there are a few cases like "show sess all" where this happens in isolated threads, and marking the thread as harmless there is a very bad idea, even worse when arriving to thread_harmless_end() which loops forever. Let's only do that when the thread is not isolated. No backport is needed as the patch above was only in 2.1-dev.	2019-08-08 07:41:52 +02:00
Frédéric Lécaille	be36793d1d	BUG/MEDIUM: stick-table: Wrong stick-table backends parsing. When parsing references to stick-tables declared as backends, they are added to a list of proxies (they are proxies!) which refer to these stick-tables. Before this patch we added them to these lists without checking whether they were already present, making the silly hypothesis that the actions/samples were checked/resolved in the same order the proxies are parsed. This patch implements a simple inline function, in_proxies_list(), to test the presence of a proxy in a list of proxies. We use this function when resolving/checking samples/actions. This bug was introduced by commit `015e4d7`. Must be backported to 2.0.	2019-08-07 10:32:31 +02:00
Olivier Houchard	4c18f94c11	BUG/MEDIUM: proxy: Make sure to destroy the stream on upgrade from TCP to H2 In stream_set_backend(), if we have a TCP stream, and we want to upgrade it to H2 instead of attempting to reuse the stream, just destroy the conn_stream, make sure we don't log anything about the stream, and pretend we failed setting the backend, so that the stream will get destroyed. New streams will then be created by the mux, as if the connection just happened. This fixes a crash when upgrading from TCP to H2, as the H2 mux totally ignored the conn_stream provided by the upgrade, as reported in github issue #196. This should be backported to 2.0.	2019-08-02 18:28:58 +02:00
Emmanuel Hocdet	f580d0f391	BUILD: ssl: BoringSSL add EVP_PKEY_base_id Remove EVP_PKEY_base_id compatibility, it is now included in BoringSSL.	2019-08-01 11:21:42 +02:00
Willy Tarreau	a37cb1880c	MINOR: wdt: also consider that waiting in the thread dumper is normal It happens that upon looping threads the watchdog fires, starts a dump, and other threads expire their budget while waiting for the other threads to get dumped and trigger a watchdog event again, adding some confusion to the traces. With this patch the situation becomes clearer as we export the list of threads being dumped so that the watchdog can check it before deciding to trigger. This way such threads in queue for being dumped are not attempted to be reported in turn. This should be backported to 2.0 as it helps understand stack traces.	2019-07-31 19:35:31 +02:00
Olivier Houchard	53055055c5	MEDIUM: pollers: Remember the state for read and write for each threads. In the poller code, instead of just remembering if we're currently polling a fd or not, remember if we're polling it for writing and/or for reading, that way, we can avoid to modify the polling if it's already polled as needed.	2019-07-31 14:54:41 +02:00
Olivier Houchard	305d5ab469	MAJOR: fd: Get rid of the fd cache. Now that the architecture was changed so that attempts to receive/send data always come from the upper layers, instead of them only trying to do so when the lower layer let them know they could try, we can finally get rid of the fd cache. We don't really need it anymore, and removing it gives us a small performance boost.	2019-07-31 14:12:55 +02:00
Willy Tarreau	5e83d996cf	BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue() A problem involving server slowstart was reported by @max2k1 in issue #197. The problem is that pendconn_grab_from_px() takes the proxy lock while already under the server's lock while process_srv_queue() first takes the proxy's lock then the server's lock. While the latter seems more natural, it is fundamentally incompatible with many other operations performed on servers, namely state change propagation, where the proxy is only known after the server and cannot be locked around the servers. However reversing the lock in process_srv_queue() is trivial and only the few functions related to dynamic cookies need to be adjusted for this so that the proxy's lock is taken for each server operation. This is possible because the proxy's server list is built once at boot time and remains stable. So this is what this patch does. The comments in the proxy and server structs were updated to mention this rule that the server's lock may not be taken under the proxy's lock but may enclose it. Another approach could consist in using a second lock for the proxy's queue which would be different from the regular proxy's lock, but given that the operations above are rare and operate on small servers list, there is no reason for overdesigning a solution. This fix was successfully tested with 10000 servers in a backend where adjusting the dyncookies in loops over the CLI didn't have a measurable impact on the traffic. The only workaround without the fix is to disable any occurrence of "slowstart" on server lines, or to disable threads using "nbthread 1". This must be backported as far as 1.8.	2019-07-30 14:02:06 +02:00
Christopher Faulet	bfab2dddad	MINOR: hlua: Add a flag on the lua txn to know in which context it can be used When a lua action or a lua sample fetch is called, a lua transaction is created. It is an entry in the stack containing the class TXN. Thanks to it, we can know the direction (request or response) of the call. But, for some functions, it is also necessary to know if the buffer is "HTTP ready" for the given direction. "HTTP ready" means there is a valid HTTP message in the channel's buffer. So, when a lua action or a lua sample fetch is called, the flag HLUA_TXN_HTTP_RDY is set if it is appropriate.	2019-07-29 11:17:52 +02:00
Willy Tarreau	d6e0c03384	BUILD: threads: add the definition of PROTO_LOCK This one was added by commit `daacf3664` ("BUG/MEDIUM: protocols: add a global lock for the init/deinit stuff") but I forgot to add it to the include file, breaking DEBUG_THREAD.	2019-07-25 07:53:56 +02:00
Christopher Faulet	98fbe9531a	MEDIUM: mux-h1: Add the support of headers adjustment for bogus HTTP/1 apps There is no standard case for HTTP header names because, as stated in the RFC7230, they are case-insensitive. So applications must handle them in a case-insensitive manner. But some bogus applications erroneously rely on the case used by most browsers. This problem becomes critical with HTTP/2 because all header names must be exchanged in lowercase. And HAProxy uses the same convention. All header names are sent in lowercase to clients and servers, regardless of the HTTP version. This design choice is linked to the HTX implementation. So, for previous versions (2.0 and 1.9), a workaround is to disable the HTX mode to fall back to the legacy HTTP mode. Since the legacy HTTP mode was removed, some users reported interoperability issues because their application was not able anymore to handle HTTP/1 messages received from HAProxy. So, we've decided to add a way to change the case of some headers before sending them. It is now possible to define a "mapping" between a lowercase header name and a version supported by the bogus application. To do so, you must use the global directives "h1-case-adjust" and "h1-case-adjust-file". Then options "h1-case-adjust-bogus-client" and "h1-case-adjust-bogus-server" may be used in proxy sections to enable the conversion. See the configuration manual for more info. Of course, our advice is to urgently upgrade these applications for interoperability concerns and because they may be vulnerable to various types of content smuggling attacks. But, if you are really forced to use an unmaintained bogus application, you may use these directives, at your own risk. If it is relevant, this feature may be backported to 2.0.	2019-07-24 18:32:47 +02:00
Willy Tarreau	daacf36645	BUG/MEDIUM: protocols: add a global lock for the init/deinit stuff Dragan Dosen found that the listeners lock is not sufficient to protect the listeners list when proxies are stopping because the listeners are also unlinked from the protocol list, and under certain situations like bombing with soft-stop signals or shutting down many frontends in parallel from multiple CLI connections, it could be possible to provoke multiple instances of delete_listener() to be called in parallel for different listeners, thus corrupting the protocol lists. Such operations are pretty rare, they are performed once per proxy upon startup and once per proxy on shut down. Thus there is no point trying to optimize anything and we can use a global lock to protect the protocol lists during these manipulations. This fix (or a variant) will have to be backported as far as 1.8.	2019-07-24 16:45:02 +02:00
Christopher Faulet	90cc4811be	BUG/MINOR: http_htx: Support empty errorfiles Empty error files may be used to disable the sending of any message for specific error codes. A common use-case is to use the file "/dev/null". This way the default error message is overridden and no message is returned to the client. It was supported in the legacy HTTP mode, but not in HTX. Because of a bug, such messages triggered an error. This patch must be backported to 2.0 and 1.9. However, the patch will have to be adapted.	2019-07-23 14:58:32 +02:00
Willy Tarreau	1c8d32bb62	MAJOR: stream: store the target address into s->target_addr When forcing the outgoing address of a connection, till now we used to allocate this outgoing connection and set the address into it, then set SF_ADDR_SET. With connection reuse this causes a whole lot of issues and difficulties in the code. Thanks to the previous changes, it is now possible to store the target address into the stream instead, and copy the address from the stream to the connection when initializing the connection. assign_server_address() does this and as a result SF_ADDR_SET now reflects the presence of the target address in the stream, not in the connection. The http_proxy mode, the peers and the master's CLI now use the same mechanism. For now the existing connection code was not removed to limit the amount of tricky changes, but the allocated connection is not used anymore. This change also revealed a latent issue that we've been having around option http_proxy : the address was set in the connection but neither the SF_ADDR_SET nor the SF_ASSIGNED flags were set. It looks like the connection could establish only due to the fact that it existed with a non-null destination address.	2019-07-19 13:50:09 +02:00
Willy Tarreau	9042060b0b	MINOR: stream: add a new target_addr entry in the stream structure The purpose will be to store the target address there and not to allocate a connection just for this anymore. For now it's only placed in the struct, a few fields were moved to plug some holes, and the entry is freed on release (never allocated yet for now). This must have no impact. Note that in order to fit, the store_count which previously was an int was turned into a short, which is way more than enough given that the hard-coded limit is 8.	2019-07-19 13:50:09 +02:00
Willy Tarreau	e71fca81dd	MAJOR: connection: remove the addr field Now addresses are dynamically allocated when needed. Each connection is created with src=dst=NULL, these entries are allocated on the fly, and released when the connection is released.	2019-07-19 13:50:09 +02:00
Willy Tarreau	ca79f59365	MEDIUM: connection: make sure all address producers allocate their address This commit places calls to sockaddr_alloc() at the places where an address is needed, and makes sure that the allocation is properly tested. This does not add too many error paths since connection allocations are already in the vicinity and share the same error paths. For the two cases where a clear_addr() was called, instead the address was not allocated.	2019-07-19 13:50:09 +02:00
Willy Tarreau	ff5d57b022	MINOR: connection: create a new pool for struct sockaddr_storage This pool will be used to allocate storage for source and destination addresses used in connections. Two functions sockaddr_{alloc,free}() were added and will have to be used everywhere an address is needed. These ones are safe for progressive replacement as they check that the existing pointer is set before replacing it. The pool is not yet used during allocation nor freeing. Also they operate on pointers to pointers so they will perform checks and replace values. The free one nulls the pointer.	2019-07-19 13:50:09 +02:00
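The pointer-to-pointer convention described for sockaddr_{alloc,free}() can be sketched as below. This is an assumption-laden illustration: the `_sketch` names are hypothetical, calloc()/free() stand in for the dedicated pool, and "check that the existing pointer is set" is read here as "reuse the address if one is already attached".

```c
#include <stdlib.h>
#include <sys/socket.h>

/* Attaches an address to <*sap> if none is set yet and returns it.
 * Returns NULL on allocation failure. calloc() replaces the pool here.
 */
static struct sockaddr_storage *sockaddr_alloc_sketch(struct sockaddr_storage **sap)
{
    if (*sap)
        return *sap; /* already allocated: keep the existing address */
    *sap = calloc(1, sizeof(**sap));
    return *sap;
}

/* Releases the address pointed to by <*sap>, if any, and nulls the pointer
 * so that a second call, or a later allocation, remains safe.
 */
static void sockaddr_free_sketch(struct sockaddr_storage **sap)
{
    free(*sap);
    *sap = NULL;
}
```

Working on the address of the pointer lets both helpers be called unconditionally, which is what makes a progressive replacement across the code base practical.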
Willy Tarreau	226572f55f	MINOR: connection: use conn->{src,dst} instead of &conn->addr.{from,to} This is in preparation for the switch to dynamic address allocation, let's migrate the code using the old fields to the pointers instead. Note that no extra check was added for now, the purpose is only to get the code to use the pointers and still work. In the proxy protocol message handling we make sure the addresses are properly allocated before declaring them unset.	2019-07-19 13:50:09 +02:00
Willy Tarreau	1ef4cbc693	MINOR: connection: add new src and dst fields At the moment we're facing difficulties with connection reuse based on the fact that connections may be allocated very early only to set a target address in transparent mode. With the imminent removal of the legacy mode, the connection reuse by a same stream will not exist anymore and all this awful complexity is not justified anymore. However we still need to be able to assign addresses somewhere. Thus instead of allocating a connection, we'll only place addresses where needed in the stream during operations. But this takes quite some room (typically 128 bytes). This is a nice opportunity for cleaning all this up and dynamically allocating the address fields, which will result in actually saving memory from connection structs since most of the time the client's "to" address is not used and the server's "from" is not used either, thus saving ~256 bytes per end-to-end connection. For now these new "src" and "dst" pointers point to addr.from and addr.to. This will allow us to smoothly update the whole code to use these pointers prior to going further and switching them to pools.	2019-07-19 13:50:09 +02:00
Willy Tarreau	cc4df3b3de	CLEANUP: connection: remove the now unused conn_get_{from,to}_addr() These functions are not used anymore. They didn't report failures and as such were often misused. conn_get_src() and conn_get_dst() now replaced them everywhere.	2019-07-19 13:50:09 +02:00
Willy Tarreau	3cc01d84b3	MINOR: backend: switch to conn_get_{src,dst}() for port and address mapping The backend connect code uses conn_get_{from,to}_addr to forward addresses in transparent mode and to map server ports, without really checking if the operation succeeds. In preparation of future changes, let's switch to conn_get_{src,dst}() and integrate status check for possible failures.	2019-07-19 13:50:09 +02:00
Willy Tarreau	2e34c11458	MINOR: connection: add conn_get_src() and conn_get_dst() These functions currently are the same as conn_get_from_addr() and conn_get_to_addr() respectively except that they return a status for the operation that the caller can test.	2019-07-19 13:50:09 +02:00
Christopher Faulet	f734638976	MINOR: http: Don't store raw HTTP errors in chunks anymore Default HTTP error messages are stored in an array of chunks. And since the HTX was added, these messages are also converted in HTX and stored in another array. But now, the first array is not used anymore because the legacy HTTP mode was removed. So now, only the array with the HTX messages are kept. The other one was removed.	2019-07-19 09:46:23 +02:00
Christopher Faulet	1b6adb4a51	MINOR: proxy/http_ana: Remove unused req_exp/rsp_exp and req_add/rsp_add lists The keywords req* and rsp* are now unsupported. So the corresponding lists are now unused. It is safe to remove them from the structure proxy. As a result, the code dealing with these rules in HTTP analyzers was also removed.	2019-07-19 09:24:12 +02:00
Christopher Faulet	8c3b63ae1d	MINOR: proxy: Remove the unused list of block rules The keyword "block" is now unsupported. So the list of block rules is now unused. It can be safely removed from the structure proxy.	2019-07-19 09:24:12 +02:00
Christopher Faulet	a6a56e6483	MEDIUM: config: Remove parsing of req* and rsp* directives It was announced for the 2.1. Following keywords are now unsupported: * reqadd, reqallow, reqiallow, reqdel, reqidel, reqdeny, reqideny, reqpass, reqipass, reqrep, reqirep, reqtarpit, reqitarpit * rspadd, rspdel, rspidel, rspdeny, rspideny, rsprep, rspirep A fatal error is emitted if one of these keywords is found during the configuration parsing.	2019-07-19 09:24:12 +02:00
Christopher Faulet	73e8ede156	MINOR: proxy: Remove support of the option 'http-tunnel' The option 'http-tunnel' is deprecated and it was only used in the legacy HTTP mode. So this option is now totally ignored and a warning is emitted during HAProxy startup if it is found in a configuration file.	2019-07-19 09:24:12 +02:00
Christopher Faulet	fc9cfe4006	REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files The old module proto_http does not exist anymore. All code dedicated to the HTTP analysis is now grouped in the file proto_htx.c. So, to finish the polishing after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in http_ana.{c,h} files. In addition, all HTX analyzers and related functions prefixed with "htx_" have been renamed to start with "http_" instead.	2019-07-19 09:24:12 +02:00
Christopher Faulet	eb2754bef8	CLEANUP: proto_http: Remove unnecessary includes and comments	2019-07-19 09:24:12 +02:00
Christopher Faulet	22dc248c2a	CLEANUP: channel: Remove the unused flag CF_WAKE_CONNECT This flag is tested or cleared but never set anymore.	2019-07-19 09:24:12 +02:00
Christopher Faulet	3716ebc50f	CLEANUP: proto_http: Group remaining flags of the HTTP transaction	2019-07-19 09:24:12 +02:00
Christopher Faulet	cc76d5b9a1	MINOR: proto_http: Remove the unused flag HTTP_MSGF_WAIT_CONN This flag is set but never used. So remove it.	2019-07-19 09:24:12 +02:00
Christopher Faulet	c41547b66e	MINOR: proto_http: Remove unused http txn flags Many flags of the HTTP transaction (TX_) are now unused and useless. So the flags TX_WAIT_CLEANUP, TX_HDR_CONN_, TX_CON_CLO_SET and TX_CON_KAL_SET were removed. Most of TX_CON_WANT_* were also removed. Only TX_CON_WANT_TUN has been kept.	2019-07-19 09:24:12 +02:00
Christopher Faulet	711ed6ae4a	MAJOR: http: Remove the HTTP legacy code First of all, all legacy HTTP analyzers and all functions exclusively used by them were removed. So most of the functions in proto_http.{c,h} were removed. Only functions to deal with the HTTP transaction have been kept. Then, http_msg and hdr_idx modules were entirely removed. And finally the structure http_msg was lightened of all its useless information about the legacy HTTP. The structure hdr_ctx was also removed because unused now, just like unused states in the enum h1_state. Note that the memory pool "hdr_idx" was removed and "http_txn" is now smaller.	2019-07-19 09:24:12 +02:00
Christopher Faulet	3d11969a91	MAJOR: filters: Remove code relying on the legacy HTTP mode This commit breaks the compatibility with filters still relying on the legacy HTTP code. The legacy callbacks were removed (http_data, http_chunk_trailers and http_forward_data). For now, the filters must still set the flag FLT_CFG_FL_HTX to be used on HTX streams.	2019-07-19 09:18:27 +02:00
Christopher Faulet	28b18c5e21	CLEANUP: proxy: Remove the flag PR_O2_USE_HTX This flag is now unused. So we can safely remove it.	2019-07-19 09:18:27 +02:00
Christopher Faulet	6d1dd46917	MEDIUM: http_fetch: Remove code relying on HTTP legacy mode Since the legacy HTTP mode is disabled, all HTTP sample fetches work on HTX streams. So it is safe to remove all code relying on HTTP legacy mode. Among other things, the function smp_prefetch_http() was removed with the associated macros CHECK_HTTP_MESSAGE_FIRST() and CHECK_HTTP_MESSAGE_FIRST_PERM().	2019-07-19 09:18:27 +02:00
Christopher Faulet	c985f6c5d8	MINOR: connection: Remove the multiplexer protocol PROTO_MODE_HTX Since the legacy HTTP mode is disabled and no multiplexer relies on it anymore, there is no reason to have 2 multiplexer protocols for the HTTP. So the protocol PROTO_MODE_HTX was removed and all HTTP multiplexers use now PROTO_MODE_HTTP.	2019-07-19 09:18:27 +02:00
Christopher Faulet	5ed8353dcf	CLEANUP: h2: Remove functions converting h2 requests to raw HTTP/1.1 ones Because the h2 multiplexer only uses the HTX mode, following H2 functions were removed : * h2_prepare_h1_reqline * h2_make_h1_request() * h2_make_h1_trailers()	2019-07-19 09:18:27 +02:00
Christopher Faulet	24e116bfe0	MINOR: htx: Slightly update htx_dump() to report better messages Sign of <tail_addr>, <head_addr> and <end_addr> is respected to not convert -1 into its unsigned representation.	2019-07-19 09:18:27 +02:00
Christopher Faulet	2bf43f0746	MINOR: htx: Use an array of char to store HTX blocks Instead of using an array of (struct block), it is more natural and intuitive to use an array of char. Indeed, not only (struct block) are stored in this array, but also their payload.	2019-07-19 09:18:27 +02:00
Christopher Faulet	192c6a23d4	MINOR: htx: Deduce the number of used blocks from tail and head values <head> and <tail> fields are now signed 32-bits integers. For an empty HTX message, these fields are set to -1. So the field <used> is now useless and can safely be removed. To know if an HTX message is empty or not, we just compare <head> against -1 (it also works with <tail>). The function htx_nbblks() has been added to get the number of used blocks.	2019-07-19 09:18:27 +02:00
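The way the block count can be deduced from <head> and <tail> may be sketched as follows. The structure below is a reduced, hypothetical stand-in for the real HTX message, and the sketch assumes that block positions run contiguously from <head> to <tail> without wrapping.

```c
#include <stdint.h>

/* Reduced stand-in for an HTX message: only the two positions matter here. */
struct htx_sketch {
    int32_t head; /* position of the oldest block, -1 if the message is empty */
    int32_t tail; /* position of the newest block, -1 if the message is empty */
};

/* An empty message is recognised by comparing <head> against -1. */
static inline int htx_is_empty_sketch(const struct htx_sketch *htx)
{
    return htx->head == -1;
}

/* Number of used blocks, deduced from the positions alone
 * (assumes head <= tail whenever the message is not empty).
 */
static inline int32_t htx_nbblks_sketch(const struct htx_sketch *htx)
{
    return htx_is_empty_sketch(htx) ? 0 : htx->tail - htx->head + 1;
}
```

With the -1 sentinel, a separate <used> counter carries no extra information, which is why the field could be dropped.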
Christopher Faulet	5a916f7326	CLEANUP: htx: Remove the unused function htx_add_blk_type_size()	2019-07-19 09:18:27 +02:00
Christopher Faulet	3b21972061	DOC: htx: Update comments in HTX files This patch may be backported to 2.0 to have accurate comments.	2019-07-19 09:18:27 +02:00
Christopher Faulet	304cc40536	MINOR: proto_htx: Add the function htx_return_srv_error() Instead of using a function from the legacy HTTP, the HTX code now uses its own one.	2019-07-19 09:18:27 +02:00
Willy Tarreau	8280ea97a0	MINOR: applet: make appctx use their own pool A long time ago, applets were seen as an alternative to connections, and since their respective sizes were roughly equal it appeared wise to share the same pool. Nowadays, connections got significantly larger but applets are not that often used, except for the cache. However applets are mostly complementary and not alternatives anymore, as it's very possible not to have a back connection or to share one with other streams. The connections will soon lose their addresses and their size will shrink so much that appctx won't fit anymore. Given that the old benefits of sharing these pools have long disappeared, let's stop doing this and have a dedicated pool for appctx.	2019-07-18 10:45:08 +02:00
Willy Tarreau	7764a57d32	BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored Since commit `81492c989` ("MINOR: threads: flatten the per-thread cpu-map"), we don't keep the procthread matrix anymore to represent the full binding possibilities, but only the proc and thread ones. The problem is that the per-process binding is not the same for each thread and for the process, and the proc[] array was assumed to store the per-proc first thread value when doing this change. Worse, the logic present there tries to deal with thread ranges and process ranges in a way which automatically excluded the other possibility (since ranges cannot be used on both) but as such fails to apply changes if neither the process nor the thread is expressed as a range. The real problem comes from the fact that specifying cpu-map 1/1 doesn't yet reveal if the per-process mask or the per-thread mask needs to be updated. In practice it's the thread one but then the current storage doesn't allow to store the binding of the first thread of each other process in nbproc>1 configurations. When removing the procthread matrix, what ought to have been kept was both the thread column for process 1 and the process line for threads 1, but instead only the thread column was kept. This patch reintroduces the storage of the configuration for the first thread of each process so that it is again possible to store either the per-thread or per-process configuration. As a partial workaround for existing configurations, it is possible to systematically indicate at least two processes or two threads at once and map them by pairs or more so that at least two values are present in the range. E.g : # set processes 1-4 to cpus 0-3 : cpu-map auto:1-4/1 0 1 2 3 # or: cpu-map 1-2/1 0 1 cpu-map 2-3/1 2 3 # set threads 1-4 to cpus 0-3 : cpu-map auto:1/1-4 0 1 2 3 # or : cpu-map 1/1-2 0 1 cpu-map 3/3-4 2 3 This fix must be backported to 2.0.	2019-07-16 15:23:09 +02:00
Andrew Heberle	9723696759	MEDIUM: mworker-prog: Add user/group options to program section This patch adds "user" and "group" config options to the "program" section so the configured command can be run as a different user.	2019-07-15 16:43:16 +02:00
Olivier Houchard	4bd5867627	BUG/MEDIUM: streams: Don't redispatch with L7 retries if redispatch isn't set. Move the logic to decide if we redispatch to a new server from sess_update_st_cer() to a new inline function, stream_choose_redispatch(), and use it in do_l7_retry() instead of just setting the state to SI_ST_REQ. That way, when using L7 retries, we won't redispatch the request to another server except if "option redispatch" is used. This should be backported to 2.0.	2019-07-12 16:17:50 +02:00
Willy Tarreau	64e6012eb9	MINOR: task: introduce work lists Sometimes we need to delegate some list processing to a function running on another thread. In this case the list element will simply be queued into a dedicated self-locked list and the task responsible for this list will be woken up, calling the associated function which will run over the list. This is what work_list does. Such lists will be dedicated to a limited type of work but will significantly ease such remote handling. A function is provided to create these per-thread lists, their tasks and to properly bind each task to a distinct thread, so that the caller only has to store the resulting pointer to the start of the structure. These structures should not be abused though as each head will consume 4 pointers per thread, hence 32 bytes per thread or 2 kB for 64 threads.	2019-07-12 09:07:48 +02:00
Olivier Houchard	4be7190c10	BUG/MEDIUM: servers: Fix a race condition with idle connections. When we're purging idle connections, there's a race condition, when we're removing the connection from the idle list, to add it to the list of connections to free, if the thread owning the connection tries to free it at the same time. To fix this, simply add a per-thread lock, that has to be held before removing the connection from the idle list, and when, in conn_free(), we're about to remove the connection from every list. That way, we know for sure the connection will stay valid while we remove it from the idle list, to add it to the list of connections to free. This should happen rarely enough that it shouldn't have any impact on performance. This has not been reported yet, but could provoke random segfaults. This should be backported to 2.0.	2019-07-11 16:16:38 +02:00
Christopher Faulet	34ce7d075a	BUG/MINOR: server: Be really able to keep "pool-max-conn" idle connections The maximum number of idle connections for a server can be configured by setting the server option "pool-max-conn". But when we try to add a connection in its idle list, because of a wrong comparison, it may be rejected because there are already "pool-max-conn - 1" idle connections. This patch must be backported to 2.0 and 1.9.	2019-07-10 14:20:52 +02:00
Willy Tarreau	1dad3843dc	BUG/MEDIUM: fd/threads: fix excessive CPU usage on multi-thread accept While experimenting with potentially improved fairness and latency using ticket locks on a Ryzen 16-thread/8-core, a very strange situation happened a lot for some levels of traffic. Around 300k connections per second, no more connections would be accepted on the multi-threaded listener but all others would continue to work fine. All attempts to trace showed that the threads were all in the trylock in the fd cache, or in the spinlock of fd_update_events(), or in the one of fd_may_recv(). But as indicated this was not a deadlock since the process continues to work fine. After quite some investigation it appeared that the issue is caused by a lack of fairness between the fdcache's trylock and these functions' spin locks above. In fact, regardless of the success or failure of the fdcache's attempt at grabbing the lock, the poller was calling fd_update_events() which locks the FD once for something that can be done with a CAS, and then calls fd_may_recv() with another lock for something that most often didn't change. The high contention on these spinlocks leaves no chance to any other thread to grab the lock using trylock(), and once this happens, there is no thread left to process incoming connection events nor to stop polling on the FD, leaving all threads at 100% CPU but partially operational. This patch addresses the issue by using bit-test-and-set instead of the OR in fd_may_recv() / fd_may_send() so that nothing is done if the FD was already configured as expected. It does the same in fd_update_events() using a CAS to check if the FD's events need to be changed at all or not. With this patch applied, it became impossible to reproduce the issue, and now there's no way to saturate all 16 CPUs with the load used for testing, as no more than 1350-1400 were noticed at 300+kcps vs 1600. 
Ideally this patch should go further and try to remove the remaining incarnations of the fdlock as this seems possible, but it's difficult enough to be done in a distinct patch that will not have to be backported. It is possible that workloads involving a high connection rate may slightly benefit from this patch and observe a slightly lower CPU usage even when the service doesn't misbehave. This patch must be backported to 2.0 and 1.9.	2019-07-09 10:41:24 +02:00
Willy Tarreau	85b2cae63c	MINOR: pools: make the thread harmless during the mmap/munmap syscalls These calls can take quite some time and leave the thread harmless so it's better to mark it as such. This makes "show sess" respond way faster during high loads running on processes built with DEBUG_UAF since these calls are stressed a lot.	2019-07-09 10:40:33 +02:00
Willy Tarreau	828675421e	MINOR: pools: always pre-initialize allocated memory outside of the lock When calling mmap(), in general the system gives us a page but does not really allocate it until we first dereference it. And it turns out that this time is much longer than the time to perform the mmap() syscall. Unfortunately, when running with memory debugging enabled, we mmap/munmap() each object resulting in lots of such calls and a high contention on the allocator. And the first accesses to the page being done under the pool lock is extremely damaging to other threads. The simple fact of writing a 0 at the beginning of the page after allocating it and placing the POOL_LINK pointer outside of the lock is enough to boost the performance by 8x in debug mode and to save the watchdog from triggering on lock contention. This is what this patch does.	2019-07-09 10:40:33 +02:00
Willy Tarreau	3e853ea74d	MINOR: pools: release the pool's lock during the malloc/free calls The malloc and free calls and especially the underlying mmap/munmap() can occasionally take a huge amount of time and even cause the thread to sleep. This is visible when haproxy is compiled with DEBUG_UAF which causes every single pool allocation/free to allocate and release pages. In this case, when using the locked pools, the watchdog can occasionally fire under high contention (typically requesting 40000 1M objects in parallel over 8 threads). Then, "perf top" shows that 50% of the CPU time is spent in mmap() and munmap(). The reason the watchdog fires is because some threads spin on the pool lock which is held by other threads waiting on mmap() or munmap(). This patch modifies this so that the pool lock is released during these syscalls. Not only does this allow other threads to try to allocate their data in parallel, but it also considerably reduces the lock contention. Note that the locked pools are only used on small architectures where high thread counts would not make sense, so this will not provide any benefit in the general case. However it makes the debugging versions way more stable, which is always appreciated.	2019-07-09 10:40:33 +02:00
Christopher Faulet	037b3ebd35	BUG/MEDIUM: stream-int: Don't rely on CF_WRITE_PARTIAL to unblock opposite si In the function stream_int_notify(), when the opposite stream-interface is blocked because there is no more room in the input buffer, if the flag CF_WRITE_PARTIAL is set on this buffer, it is unblocked. It is a way to unblock the reads on the other side because some data was sent. But it is a problem during the fast-forwarding because only the stream is able to remove the flag CF_WRITE_PARTIAL. So it is possible to have this flag because of a previous send while the input buffer of the opposite stream-interface is now full. In such a case, the opposite stream-interface will be woken up for nothing because its input buffer is full. If the same happens on the opposite side, we will have a loop consuming all the CPU. To fix the bug, the opposite side is now only notified if there is some available room in its input buffer in the function si_cs_send(), so only if some data was sent. This patch must be backported to 2.0 and 1.9.	2019-07-05 14:26:15 +02:00
Christopher Faulet	2e4843d1d2	MINOR: action: Add the return code ACT_RET_DONE for actions This code should now be used by actions to stop both the rules processing and any following processing. The return code ACT_RET_STOP, for its part, should be used to stop only the rules processing. So concretely, for TCP rules, there is no change: ACT_RET_STOP and ACT_RET_DONE are handled the same way. However, for HTTP rules, ACT_RET_STOP should now be mapped to HTTP_RULE_RES_STOP and ACT_RET_DONE to HTTP_RULE_RES_DONE. This way, an action will have the possibility to stop all processing or only the rules processing. Note that the changes for TCP are done in this commit, but the changes for HTTP will be done in another one because it will fix a bug at the same time. This patch must be backported to 2.0 because a bugfix depends on it.	2019-07-05 14:26:14 +02:00
Olivier Houchard	cee0389088	BUG/MEDIUM: sessions: Don't keep an extra idle connection in sessions. When deciding if we keep an idle connection in the session, check if the number of connections currently in the session is >= the max allowed, not >, or we'll keep an extra connection. This should be backported to 1.9 and 2.0.	2019-07-04 14:28:18 +02:00
Olivier Houchard	2ab3dada01	BUG/MEDIUM: connections: Make sure we're unsubscribe before upgrading the mux. Just calling conn_force_unsubscribe() from conn_upgrade_mux_fe() is not enough, as there may be multiple XPRT involved. Instead, require that any user of conn_upgrade_mux_fe() unsubscribe itself before calling it. This should fix upgrading a TCP connection to HTX when using SSL. This should be backported to 2.0.	2019-07-03 13:57:30 +02:00
Christopher Faulet	621da6bafa	BUG/MEDIUM: channel/htx: Use the total HTX size in channel_htx_recv_limit() The receive limit of an HTX channel must be calculated against the total size of the HTX message. Otherwise, the buffer may never be seen as full whereas the receive limit is 0. Indeed, the function channel_htx_full() already takes care to add a block size to the buffer's reserve (8 bytes). So if the function channel_htx_recv_limit() also keep a block size free in addition to the buffer's reserve, it means that at least 2 block size will be kept free but only one will be taken into account, freezing the stream if the option http-buffer-request is enabled. This patch fixes the Github issue #136. It should be backported to 2.0 and 1.9. Thanks jaroslawr (Jarosław Rzeszótko) for his help.	2019-07-02 21:32:45 +02:00
Olivier Houchard	6c7e96a3e1	BUG/MEDIUM: connections: Always call shutdown, with no linger. Revert commit `fe4abe62c7`. The goal was to make sure for health-checks, we would not get sockets in TIME_WAIT. To do so, we would not call shutdown() if linger_risk is set. However that is wrong, and that means shutw would never be forwarded to the server, and thus we could get connection that are never properly closed. Instead, to fix the original problem as described here : https://www.mail-archive.com/haproxy@formilux.org/msg34080.html Just make sure the checks code call cs_shutr() before calling cs_shutw(). If shutr has been called, conn_sock_shutw() will make no attempt to call shutdown(), as it knows close() will be called. We should really review and revamp the shutr/shutw code, as described in github issue #142. This should be backported to 1.9 and 2.0.	2019-07-02 16:40:55 +02:00
William Lallemand	ad03288e6b	BUG/MINOR: mworker/cli: don't output a \n before the response When using a level lower than admin on the master CLI, a \n is output before the response, this is caused by the response of the "operator" or "user" that are sent before the actual command. To fix this problem we introduce the flag APPCTX_CLI_ST1_NOLF which ask a command response to not be followed by the final \n. This patch made a special case with the command operator and user followed by a - so they are not followed by \n. This patch must be backported to 2.0 and 1.9.	2019-07-01 15:34:11 +02:00
Christopher Faulet	bb0efcdd29	MINOR: htx: Add the function htx_change_blk_value_len() As its name suggest, this function change the value length of a block. But it also update the HTX message accordingly. It simplifies the HTX API. The function htx_set_blk_value_len() is still available and must be used with caution because this one does not update the HTX message. It just updates the HTX block. It should be considered as an internal function. When possible, htx_change_blk_value_len() should be used instead. This function is used to fix a bug affecting the 2.0. So, this patch must be backported to 2.0.	2019-06-18 10:01:55 +02:00
Baptiste Assmann	da29fe2360	MEDIUM: server: server-state global file stored in a tree Server states can be recovered from either a "global" file (all backends) or a "local" file (per backend). The way the algorithm to parse the state file was first implemented was good enough for a low number of backends and servers per backend. Basically, for each backend the state file (global or local) is opened, parsed entirely and for each line we check if it contains data related to a server from the backend we're currently processing. We must read the file entirely, just in case some lines for the current backend are stored at the end of the file. This does not scale at all! This patch changes the behavior above for the "global" file only. Now, the global file is read and parsed once and all lines it contains are stored in a tree, for faster discovery. This result in way much less fopen, fgets, and strcmp calls, which make loading of very big state files very quick now.	2019-06-17 13:40:42 +02:00
Tim Duesterhus	86e6b6ebf8	MEDIUM: Make '(cli|con|srv)timeout' directive fatal They were deprecated with HAProxy 1.5. Time to remove them.	2019-06-17 13:35:54 +02:00
Tim Duesterhus	dac168bc15	MEDIUM: Make 'redispatch' directive fatal It was deprecated with HAProxy 1.5. Time to remove it.	2019-06-17 13:35:54 +02:00
Tim Duesterhus	7b7c47f05c	MEDIUM: Make 'block' directive fatal It was deprecated with HAProxy 1.5. Time to remove it.	2019-06-17 13:35:54 +02:00
Willy Tarreau	9dc6b97429	[RELEASE] Released version 2.1-dev0 Released version 2.1-dev0 with the following main changes : - exact copy of 2.0.0	2019-06-16 21:49:47 +02:00
Willy Tarreau	bd20a9dd4e	BUG: tasks: fix bug introduced by latest scheduler cleanup In commit `86eded6c6` ("CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*") which consisted in removing the casts between tasks and tasklet, I was a bit too fast to believe that we only saw tasklets in this function since process_runnable_tasks() also uses it with tasks under a cast. So removing the bookkeeping on task_list_size was not appropriate. Bah, the joy of casts which hide the real thing... This patch does two things at once to address this mess once for all: - it restores the decrement of task_list_size when it's a real task, but moves it to process_runnable_task() since it's the only place where it's allowed to call it with a task - it moves the increment there as well and renames task_insert_into_tasklet_list() to tasklet_insert_into_tasklet_list() of obvious consistency reasons. This way the increment/decrement of task_list_size is made at the only places where the cast is enforced, so it has less risks to be missed. The comments on top of these functions were updated to reflect that they are only supposed to be used with tasklets and that the caller is responsible for keeping task_list_size up to date if it decides to enforce a task there. Now we don't have to worry anymore about how these functions work outside of the scheduler, which is better longterm-wise. Thanks to Christopher for spotting this mistake. No backport is needed.	2019-06-14 18:16:19 +02:00
Olivier Houchard	fe4abe62c7	BUG/MEDIUM: connections: Don't call shutdown() if we want to disable linger. In conn_sock_shutw(), avoid calling shutdown() if linger_risk is set. Not doing so will result in getting sockets in TIME_WAIT for some time. This is particularly observable with health checks. This should be backported to 1.9.	2019-06-14 15:33:41 +02:00
Willy Tarreau	86eded6c69	CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_* The function really only operates on tasklets, its arguments are always tasklets cast as tasks to match the function's type, to be cast back to a struct tasklet. Let's rename it to tasklet_remove_from_tasklet_list(), take a struct tasklet, and get rid of the undesired task casts.	2019-06-14 14:57:03 +02:00
Willy Tarreau	3c39a7d889	CLEANUP: connection: rename the wait_event.task field to .tasklet It's really confusing to call it a task because it's a tasklet and used in places where tasks and tasklets are used together. Let's rename it to tasklet to remove this confusion.	2019-06-14 14:42:29 +02:00
Christopher Faulet	e21c01637a	MINOR: htx: Add 3 flags on the start-line to deal with the request schemes The first one, HTX_SL_F_HAS_SCHM, will be used to know the request has an explicit scheme. So, in H2, it is always true because the pseudo-header ":scheme" is mandatory. In H1, it is only true when an absolute URI is found on the start-line. The other flags, HTX_SL_F_SCHM_HTTP and HTX_SL_F_SCHM_HTTPS, will be used to know which scheme the request have. For now, other protocols are not handled. The aim of these flags is to pass this information to the backend side in general, and to the H2 mux in particular. So the multiplexer will have a chance to use this information to send the right scheme to the server.	2019-06-14 11:13:32 +02:00
Christopher Faulet	36a7702b03	CLEANUP: channel: Remove channel_htx_fwd_payload() and channel_htx_fwd_all() These functions are unused now. No backport needed.	2019-06-14 11:13:32 +02:00
Christopher Faulet	421e769783	BUG/MEDIUM: htx: Don't change position of the first block during HTX analysis In the HTX structure, the field <first> is used to know where to (re)start the analysis. It may differ from the message's head. It is especially important to update it to handle 1xx messages, to be sure to restart the analysis on the next message (another 1xx message or the final one). It is also updated when some data are forwarded (the headers or part of the body). But this update is an error and must never be done at the analysis level. It is a bug, because some sample fetches may be used after the data forwarding (but before the first send of course). At this stage, if the first block position does not point to the start-line, most HTTP sample fetches fail. So now, when something is forwarded by HTX analyzers, the first block position is no longer updated. This issue was reported on Github. See #119. No backport needed.	2019-06-14 11:13:32 +02:00
Christopher Faulet	87ebe944d6	BUG/MINOR: channel/htx: Call channel_htx_full() from channel_full() When channel_full() is called for an HTX stream, we fall back on the HTX version. This function is called, among other, from tcp_inspect_request(). With this patch, the inspect delay is respected again. This patch must be backported to 1.9.	2019-06-14 11:13:32 +02:00
Willy Tarreau	3cec0f94f3	BUG/MINOR: task: prevent schedulable tasks from starving under high I/O activity With both I/O and tasks in the same tasklet list, we now have a very smooth and responsive scheduler, providing a good fairness between I/O activities. With the lower layers relying on tasklet a lot (I/O wakeup, subscribe, etc), there may often be a large number of totally autonomous tasklets doing their business such as forwarding data between two muxes. But the task scheduler historically refrained from picking tasks from the priority-ordered run queue to put them into the tasklet list until this later had less than max_runqueue_depth entries. This was to make sure that low-latency, high-priority tasks would have an opportunity to be dequeued before others even if they arrive late. But the counter used for this is still the tasklet list size, which contains countless I/O events. This causes an unfairness between unbounded I/Os and bounded tasks, resulting for example in the CLI responding slower when forwarding 40 Gbps of HTTP traffic spread over a thousand of connections. A good solution consists in sticking to the initial intent of max_runqueue_depth which is to limit the number of tasks in the list (to maintain fairness between them) and not to limit the number of these tasks among tasklets. It just turns out that the task_list_size initially was this task counter and changed over time to be a tasklet list size. Let's simply refrain from updating it for pure tasklets so that it takes back its original role of counting real tasks as its name implies. With this change the CLI becomes instantly responsive under load again. This patch may possibly be backported to 1.9 though it requires some careful checks.	2019-06-14 09:16:51 +02:00
William Lallemand	1dc6963086	MINOR: mworker: add the HAProxy version in "show proc" Displays the HAProxy version so you can compare the version of old processes and new ones.	2019-06-12 19:19:57 +02:00
Olivier Houchard	a0fdce3950	MINOR: fd: Don't use atomic operations when it's not needed. In updt_fd_polling(), when updating fd_nbupdt, there's no need to use an atomic operation, as it's a TLS variable.	2019-06-12 14:36:24 +02:00
Christopher Faulet	86fcf6d6cd	MINOR: htx: Add the function htx_move_blk_before() The function htx_add_data_before() was removed because it was buggy. The function htx_move_blk_before() may be used if necessary to do something equivalent, except it just moves blocks. It doesn't handle the adding.	2019-06-11 14:05:25 +02:00
Christopher Faulet	d7884d3449	MAJOR: htx: Rework how free rooms are tracked in an HTX message In an HTX message, there may be 2 available rooms to store a new block. The first one is between the blocks and their payload. Blocks are added starting from the end of the buffer and their payloads are added starting from the beginning. So the first free room is between these 2 edges. The second one is at the beginning of the buffer, when we start to wrap to add new payloads. Once we start to use this one, the other one is ignored until the next defragmentation of the HTX message. In theory, there is no problem. But in practice, some shortcomings in the HTX structure force us to defragment HTX messages too often to always be in a known state. The second free room is not tracked as it should be and the first one may be easily corrupted when rewrites happen. So to fix the problem and avoid unnecessary defragmentation, the HTX structure has been refactored. The front (the block's position of the first payload before the blocks) is no longer stored. Instead we keep the relative addresses of 3 edges:
  * tail_addr : The start address of the free space in front of the blocks table
  * head_addr : The start address of the free space at the beginning
  * end_addr : The end address of the free space at the beginning
Here is the general view of the HTX message now:
           head_addr    end_addr     tail_addr
               |            |            |
               V            V            V
  +------------+------------+------------+------------+------------------+
  |            |            |            |            |                  |
  |  PAYLOAD   | Free space |  PAYLOAD   | Free space |   Blocks area    |
  |    ==>     |     1      |    ==>     |     2      |       <==        |
  +------------+------------+------------+------------+------------------+
<head_addr> is always lower or equal to <end_addr> and <tail_addr>. <end_addr> is always lower or equal to <tail_addr>. In addition, to simplify everything, the blocks area is now contiguous. It doesn't wrap anymore. So the head is always the block with the lowest position, and the tail is always the one with the highest position.	2019-06-11 14:05:25 +02:00
Christopher Faulet	86bc8df955	BUG/MEDIUM: compression/htx: Fix the adding of the last data block The function htx_add_data_before() is buggy and cannot work. It first adds a data block and then moves it before another one, passed as an argument. The problem happens when a defragmentation is done to add the new block. In this case, the reference is no longer valid, because the blocks are rearranged. So, instead of moving the new block before the reference, it is moved to the head of the HTX message. So this function has been removed. It was only used by the compression filter to add a last data block before a TLR, EOT or EOM block. Now, the new function htx_add_last_data() is used. It adds a last data block, after all others and before any TLR, EOT or EOM block. Then, the next block is retrieved. It is the first non-data block after data in the HTX message. The compression loop continues with it. This patch must be backported to 1.9.	2019-06-11 14:05:25 +02:00
Willy Tarreau	9a1f57351d	MEDIUM: threads: add thread_sync_release() to synchronize steps This function provides an alternate way to leave a critical section run under thread_isolate(). Currently, a thread may remain in thread_release() without having the time to notice that the rdv mask was released and taken again by another thread entering thread_isolate() (often the same that just released it). This is because threads wait in harmless mode in the loop, which is compatible with the conditions to enter thread_isolate(). It's not possible to make them wait with the harmless bit off or we cannot know when the job is finished for the next thread to start in thread_isolate(), and if we don't clear the rdv bit when going there, we create another race on the start point of thread_isolate(). This new synchronous variant of thread_release() makes use of an extra mask to indicate the threads that want to be synchronously released. In this case, they will be marked harmless before releasing their sync bit, and will wait for others to release their bit as well, guaranteeing that thread_isolate() cannot be started by any of them before they all left thread_sync_release(). This allows to construct synchronized blocks like this :
    thread_isolate()
    /* optionally do something alone here */
    thread_sync_release()
    /* do something together here */
    thread_isolate()
    /* optionally do something alone here */
    thread_sync_release()
And so on. This is particularly useful during initialization where several steps have to be respected and no thread must start a step before the previous one is completed by other threads. This one must not be placed after any call to thread_release() or it would risk to block an earlier call to thread_isolate() which the current thread managed to leave without waiting for others to complete, and end up here with the thread's harmless bit cleared, blocking others. This might be improved in the future.	2019-06-10 09:42:43 +02:00
Willy Tarreau	9faebe34cd	MEDIUM: tools: improve time format error detection As reported in GH issue #109 and in discourse issue https://discourse.haproxy.org/t/haproxy-returns-408-or-504-error-when-timeout-client-value-is-every-25d the time parser doesn't error on overflows nor underflows. This is a recurring problem which additionally has the bad taste of taking a long time before hitting the user. This patch makes parse_time_err() return special error codes for overflows and underflows, and adds the control in the call places to report suitable errors depending on the requested unit. In practice, underflows are almost never returned as the parsing function takes care of rounding values up, so this might possibly happen on 64-bit overflows returning exactly zero after rounding though. It is not really possible to cut the patch into pieces as it changes the function's API, hence all callers. Tests were run on about every relevant part (cookie maxlife/maxidle, server inter, stats timeout, timeout*, cli's set timeout command, tcp-request/response inspect-delay).	2019-06-07 19:32:02 +02:00
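The overflow and underflow reporting described in this commit can be illustrated with a small sketch. It is not HAProxy's parse_time_err(): the function name, the error enum and the 32-bit millisecond limit are all assumptions chosen for the example. It only shows the general idea of scaling a parsed value to a target unit, rounding up, and returning a distinct code when the result does not fit.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical error codes, for illustration only. */
enum sketch_time_err {
	SK_TIME_OK = 0,
	SK_TIME_OVERFLOW,
	SK_TIME_UNDERFLOW
};

/* Scale <value> by <mult>/<div> (e.g. days to milliseconds), rounding up,
 * and store the result in <out>. <div> must be non-zero. The 32-bit limit
 * on the result is an assumption of this sketch.
 */
static enum sketch_time_err sketch_scale_time_ms(unsigned long long value,
                                                 unsigned long long mult,
                                                 unsigned long long div,
                                                 uint32_t *out)
{
	unsigned long long res;

	/* the multiplication itself must not wrap */
	if (mult != 0 && value > ULLONG_MAX / mult)
		return SK_TIME_OVERFLOW;

	res = value * mult;
	/* round up without adding (div - 1), which could wrap */
	res = res / div + (res % div != 0);

	if (res > UINT32_MAX)
		return SK_TIME_OVERFLOW;

	/* a non-zero input that collapses to zero is an underflow; with
	 * the rounding above this only happens when <mult> is zero
	 */
	if (value != 0 && res == 0)
		return SK_TIME_UNDERFLOW;

	*out = (uint32_t)res;
	return SK_TIME_OK;
}
```

Because the result is rounded up, a small non-zero duration never silently becomes zero, which matches the commit's remark that underflows are almost never returned in practice.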
Frédéric Lécaille	b65717fa55	MINOR: peers: Optimization for dictionary cache lookup. When we look up a dictionary entry in the cache used upon transmission, we store the last result in ->prev_lookup of struct dcache_tx so as to compare it with the subsequent entries to look up and improve performance.	2019-06-07 15:47:54 +02:00
Frédéric Lécaille	99de1d0479	MINOR: dict: Store the length of the dictionary entries. When allocating new dictionary entries we store the length of the strings. This may be useful to avoid calling strlen() too often at run time.	2019-06-07 15:47:54 +02:00
Frédéric Lécaille	6c39198b57	MINOR peers: data structure simplifications for server names dictionary cache. We store pointers to server names dictionary entries in a pre-allocated array of ebpt_node's (->entries member of struct dcache_tx) to cache those sent to remote peers. Consequently the ID used to identify the server name dictionary entry is also used as index for this array. There is no need to implement a lookup by key for this dictionary cache.	2019-06-07 15:47:54 +02:00
Willy Tarreau	1bfd6020ce	MINOR: logs: use the new bitmap functions instead of fd_sets for encoding maps The fd_sets we've been using in the log encoding functions are not portable and were shown to break at least under Cygwin. This patch gets rid of them in favor of the new bitmap functions. It was verified with the config below that the log output was exactly the same before and after the change : defaults mode http option httplog log stdout local0 timeout client 1s timeout server 1s timeout connect 1s frontend foo bind :8001 capture request header chars len 255 backend bar option httpchk "GET" "/" "HTTP/1.0\r\nchars: \x01\x02\x03\x04\x05\x06\x07\x08\x09\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" server foo 127.0.0.1:8001 check	2019-06-07 11:13:24 +02:00
Willy Tarreau	7355b040d1	MINOR: tools: add new bitmap manipulation functions We now have ha_bit_{set,clr,flip,test} to manipulate bitfields made of arrays of longs. The goal is to get rid of the remaining non-portable FD_{SET,CLR,ISSET} that still exist at a few places.	2019-06-07 10:44:49 +02:00
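As a rough illustration of what bit helpers over arrays of longs can look like, here is a minimal sketch. The sketch_bit_* names and their argument order are hypothetical; they are not the ha_bit_* functions themselves, whose exact prototypes are not given in this log.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits held by one word of the map. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helpers: <map> is an array of unsigned long large enough
 * to hold bit number <bit>. No bounds checking is done here.
 */
static inline void sketch_bit_set(unsigned long bit, unsigned long *map)
{
	map[bit / SKETCH_BITS_PER_LONG] |= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

static inline void sketch_bit_clr(unsigned long bit, unsigned long *map)
{
	map[bit / SKETCH_BITS_PER_LONG] &= ~(1UL << (bit % SKETCH_BITS_PER_LONG));
}

static inline void sketch_bit_flip(unsigned long bit, unsigned long *map)
{
	map[bit / SKETCH_BITS_PER_LONG] ^= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

/* Returns 1 if the bit is set, otherwise 0. */
static inline int sketch_bit_test(unsigned long bit, const unsigned long *map)
{
	return !!(map[bit / SKETCH_BITS_PER_LONG] &
	          (1UL << (bit % SKETCH_BITS_PER_LONG)));
}
```

Unlike fd_set, such a map has a layout fully defined by the caller, which is the portability point the commit makes.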
Willy Tarreau	ad660e3f84	BUILD: stream-int: avoid a build warning in dev mode in si_state_bit() The BUG_ON() test emits a warning about an always-true comparison regarding <state> which cannot be lower than zero. Let's get rid of it.	2019-06-06 16:42:08 +02:00
Willy Tarreau	3b285d7fbd	MINOR: stream-int: make si_sync_send() from the send code of si_update_both() Just like we have a synchronous recv() function for the stream interface, let's have a synchronous send function that we'll be able to call from different places. For now this only moves the code, nothing more.	2019-06-06 16:36:19 +02:00
Willy Tarreau	236c4298b3	MINOR: stream-int: split si_update() into si_update_rx() and si_update_tx() We should not update the two directions at once, in fact we should update the Rx path after recv() and the Tx path after send(). Let's start by splitting the update function in two for this.	2019-06-06 16:36:19 +02:00
Willy Tarreau	8c603ded39	MEDIUM: stream-int: make idle-conns switch to ST_RDY The purpose of making idle-conns switch to SI_ST_CON was to make the transition detectable and the operation retryable in case of connection error. Now we have the RDY state for this which is much more suitable since it indicates a validated connection on which we didn't necessarily send anything yet. This will still lead to a transition to EST while not requiring unnatural write polling nor connect timeouts.	2019-06-06 16:36:19 +02:00
Willy Tarreau	4f283fa604	MEDIUM: stream-int: introduce a new state SI_ST_RDY The main reason for all the trouble we're facing with stream interface error or timeout reports during the connection phase is that we currently can't make the difference between a connection attempt and a validated connection attempt. It is problematic because we tend to switch early to SI_ST_EST but can't always do what we want in this state since it's supposed to be set when we don't need to visit sess_establish() again. This patch introduces a new state between SI_ST_CON and SI_ST_EST, which is SI_ST_RDY. It indicates that we've verified that the connection is ready. It's a transient state, like SI_ST_DIS, that cannot persist when leaving process_stream(). For now it is not set, only verified in various tests where SI_ST_CON was used or SI_ST_EST depending on the cases. The stream-int state diagram was minimally updated to reflect the new state, though it is largely obsolete and would need to be seriously updated.	2019-06-06 16:36:19 +02:00
Willy Tarreau	7ab22adbf7	MEDIUM: stream-int: remove dangerous interval checks for stream-int states The stream interface state checks involving ranges were replaced with checks on a set of states, already revealing some issues. No issue was fixed, all was replaced in a one-to-one mapping for easier control. Some checks involving a strict difference were also replaced with fields to be clearer. At this stage, the result must be strictly equivalent. A few tests were also turned to their bit-field equivalent for better readability or in preparation for upcoming changes. The test performed in the SPOE filter was swapped so that the closed and error states are evicted first and that the established vs conn state is tested second.	2019-06-06 16:36:19 +02:00
Willy Tarreau	bedcd698b3	MINOR: stream-int: use bit fields to match multiple stream-int states at once At some places we do check for ranges of stream-int states but those are confusing as states ordering is not well known (e.g. it's not obvious that CER is between CON and EST). Let's create a bit field from states so that we can match multiple states at once instead. The new enum si_state_bit contains SI_SB_* which are state bits instead of state values. The function si_state_in() indicates if the state in argument is one of those represented by the bit mask in second argument.	2019-06-06 16:36:19 +02:00
Olivier Houchard	03abf2d31e	MEDIUM: connections: Remove CONN_FL_SOCK* Now that the various handshakes come with their own XPRT, there's no need for the CONN_FL_SOCK* flags, and the conn_sock_want|stop functions, so garbage-collect them.	2019-06-05 18:03:38 +02:00
Olivier Houchard	fe50bfb82c	MEDIUM: connections: Introduce a handshake pseudo-XPRT. Add a new XPRT that is used when using non-SSL handshakes, such as proxy protocol or Netscaler, instead of taking care of it in conn_fd_handler(). This XPRT is installed when any of those is used, and it removes itself once the handshake is done. This should allow us to remove the distinction between CO_FL_SOCK* and CO_FL_XPRT*.	2019-06-05 18:03:38 +02:00
Olivier Houchard	2e055483ff	MINOR: connections: Add a new xprt method, add_xprt(). Add a new method to xprt_ops, add_xprt(), that changes the underlying xprt to the one provided, and optionally provide the old one.	2019-06-05 18:03:38 +02:00
Olivier Houchard	5149b59851	MINOR: connections: Add a new xprt method, remove_xprt. Add a new method to xprt_ops, remove_xprt. When called, if the provided xprt_ctx is the same as the xprt's underlying xprt_ctx, it then uses the new xprt provided, otherwise it calls the remove_xprt method of the next xprt. The goal is to be able to add a temporary xprt, that removes itself from the chain when it did what it had to do. This will be used to implement a pseudo-xprt for anything that just requires a handshake (such as the proxy protocol).	2019-06-05 18:03:38 +02:00
Olivier Houchard	000694cf96	MINOR: ssl: Make ssl_sock_handshake() static. ssl_sock_handshake is now only used by the ssl code itself, there's no need to export it anymore, so make it static.	2019-06-05 18:03:38 +02:00
Olivier Houchard	ea8dd949e4	MEDIUM: ssl: Handle subscribe by itself. As the SSL code may have different needs than the upper layer, i.e. it may want to receive when the upper layer wants to write, instead of directly forwarding the subscribe to the underlying xprt, handle it ourselves. The SSL code will now remember any subscribe call, and wake the tasklet when it is ready for more I/O.	2019-06-05 18:03:38 +02:00
Christopher Faulet	54b5e214b0	MINOR: htx: Don't use end-of-data blocks anymore This type of block is useless because the transition between data and trailers is obvious. And when there are no trailers, the end-of-message is still there to know when data end for chunked messages.	2019-06-05 10:12:11 +02:00
Christopher Faulet	2d7c5395ed	MEDIUM: htx: Add the parsing of trailers of chunked messages HTTP trailers are now parsed in the same way headers are. It means trailers are converted to K/V blocks followed by an end-of-trailer marker. For now, to make things simple, the type for trailer blocks is not the same as for header blocks. But the aim is to make no difference between headers and trailers by using the same type. Probably for the end-of marker too.	2019-06-05 10:12:11 +02:00
Christopher Faulet	8f3c256f7e	MEDIUM: cache/htx: Always store info about HTX blocks in the cache It was only done for the headers (including the EOH marker). Data were prefixed by the info field of these blocks. The payload and the trailers of the messages were stored in raw. The total size of headers and payload were kept in the cached object state to help output formatting. Now, info about each HTX block is stored in the cache. Only data are allowed to be split. Otherwise, all blocks of an HTX message are handled the same way, both when storing a message in the cache and when delivering it from the cache. This will help the cache implementation to be more robust to internal changes in the HTX. Especially for the upcoming parsing of trailers. There is also no more need to keep extra info in the cached object state.	2019-06-05 10:12:11 +02:00
Christopher Faulet	a4f9dd4a56	BUG/MINOR: channel/htx: Don't alter channel during forward for empty HTX message In channel_htx_forward() and channel_htx_forward_forever(), if the HTX message is empty, the underlying buffer may be really empty too. And we have no warranty the caller will call htx_to_buf() later. And in practice, it is almost never done. So the channel's buffer must not be altered. Otherwise, the buffer may be considered as full (data == size) for an empty HTX message and no outgoing data. This patch must be backported to 1.9.	2019-06-05 10:12:11 +02:00
Frédéric Lécaille	8d78fa7def	MINOR: peers: Make peers protocol support new "server_name" data type. Make use of the APIs implemented for dictionaries (dict.c) and their LRU caches (struct dcache) to send/receive the server names used for server-by-name stickiness. These names are sent over the network as follows: - in every case we send the encoded length of the data (STD_T_DICT), then - if the server name is not present in the cache used upon transmission (struct dcache_tx) we cache it and we send the ID of this TX cache entry followed by the encoded length of the server name, and finally the server name itself (non NULL-terminated string). - if the server name is present, we repeat these operations but we only send the TX cache entry ID. Upon receipt, the (cache ID, server name) pairs are stored in the LRU cache used only upon receipt (struct dcache_rx). As the peers protocol is symmetrical, the fact that the server name is present in the received data (resp. or not) denotes if the entry is absent (resp. or not).	2019-06-05 08:42:33 +02:00
Frédéric Lécaille	7da71293e4	MINOR: server: Add a dictionary for server names. This patch only declares and defines a dictionary for the server names (stored as ->id member field).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	84d6046a33	MINOR: proxy: Add a "server by name" tree to proxy. Add a tree to proxy struct to lookup by name for servers attached to this proxy and populate it at parsing time.	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	5ad57ea85f	MINOR: stick-table: Add "server_name" new data type. This simple patch only adds definitions to create a new stick-table data type ID and a new standard type to store information in relation with dictionary entries (STD_T_DICT).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	74167b25f7	MINOR: peers: Add a LRU cache implementation for dictionaries. We want to send some stick-table data fields stored as strings in dictionaries without consuming too much memory and CPU. To do so we implement with this patch a cache for sent/received dictionary entries. These dictionary of strings entries are stored in other real dictionary entries with an identifier as key (unsigned int) and a pointer to the dictionary of strings entries as values.	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	4a3fef834c	MINOR: dict: Add dictionary new data structure. This patch adds minimalistic definitions to implement dictionary new data structure which is an ebtree of ebpt_node structs with strings as keys. Note that this has nothing to do with real dictionary data structure (maps of keys in association with values).	2019-06-05 08:33:35 +02:00
Frédéric Lécaille	1673bbdf98	CLEANUP: peers: Remove tabs characters. This patch only replaces very annoying tabulation characters by spaces so that not to have to use again tabulations where they should not be used.	2019-06-05 08:33:34 +02:00
Willy Tarreau	7bb39d7cd6	CLEANUP: connection: remove the now unused CS_FL_REOS flag Let's remove it before it gets used again. It was mostly replaced with CS_FL_EOI and by mux-specific states or flags.	2019-06-03 14:23:33 +02:00
Willy Tarreau	7067b3a92e	BUG/MINOR: deinit/threads: make hard-stop-after perform a clean exit As reported in GH issue #99, when hard-stop-after triggers and threads are in use, the chance that any thread releases the resources in use by the other ones is non-null. Thus no thread should be allowed to deinit() nor exit by itself. Here we take a different approach. We simply use a 3rd possible value for the "killed" variable so that all threads know they must break out of the run-poll-loop and immediately stop. This patch was tested by commenting the stream_shutdown() calls in hard_stop() to increase the chances to see a stream use released resources. With this fix applied, it never crashes anymore. This fix should be backported to 1.9 and 1.8.	2019-06-02 11:30:07 +02:00
Alexander Liu	2a54bb74cd	MEDIUM: connection: Upstream SOCKS4 proxy support Have "socks4" and "check-via-socks4" server keyword added. Implement handshake with SOCKS4 proxy server for tcp stream connection. See issue #82. I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found at "https://www.openssh.com/txt/socks4.protocol". Please refer to it. [wt: for now connecting to the SOCKS4 proxy over unix sockets is not supported, and mixing IPv4/IPv6 is discouraged; indeed, the control layer is unique for a connection and will be used both for connecting and for target address manipulation. As such it may for example report incorrect destination addresses in logs if the proxy is reached over IPv6]	2019-05-31 17:24:06 +02:00
Olivier Houchard	cfbb3e6560	MEDIUM: tasks: Get rid of active_tasks_mask. Remove the active_tasks_mask variable, we can deduce if we've work to do by other means, and it is costly to maintain. Instead, introduce a new function, thread_has_tasks(), that returns non-zero if there's tasks scheduled for the thread, zero otherwise.	2019-05-29 21:53:37 +02:00
Olivier Houchard	250031e444	MEDIUM: sessions: Introduce session flags. Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when we use NTLM authentication, and we should reuse the last connection. This should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST. This should be backported to 1.9.	2019-05-29 15:41:47 +02:00
Willy Tarreau	ef28dc11e3	MINOR: task: turn the WQ lock to an RW_LOCK For now it's exclusively used as a write lock though, thus it remains 100% equivalent to the spinlock it replaces.	2019-05-28 19:15:44 +02:00
Willy Tarreau	186e96ece0	MEDIUM: buffers: relax the buffer lock a little bit In lock profiles it's visible that there is a huge contention on the buffer lock. The reason is that when offer_buffers() is called, it systematically takes the lock before verifying if there is any waiter. However doing so doesn't protect against races since a waiter can happen just after we release the lock as well. Similarly in h2 we take the lock every time an h2c is going to be released, even without checking that the h2c belongs to a wait list. These two have now been addressed by verifying non-emptiness of the list prior to taking the lock.	2019-05-28 17:25:21 +02:00
Willy Tarreau	a8b2ce02b8	MINOR: activity: report the number of failed pool/buffer allocations Haproxy is designed to be able to continue to run even under very low memory conditions. However this can sometimes have a serious impact on performance that is hard to diagnose. Let's report counters of failed pool and buffer allocations per thread in show activity.	2019-05-28 17:25:21 +02:00
Willy Tarreau	2ae84e445d	MEDIUM: poller: separate the wait time from the wake events We have been abusing the do_poll()'s timeout for a while, making it zero whenever there is some known activity. The problem this poses is that it complicates activity diagnostic by incrementing the poll_exp field for each known activity. It also requires extra computations that could be avoided. This change passes a "wake" argument to say that the poller must not sleep. This simplifies the operations and allows one to differentiate expirations from activity.	2019-05-28 17:25:21 +02:00
Willy Tarreau	0a7ef02074	MINOR: htx: make htx_add_data() return the transmitted byte count In order to later allow htx_add_data() to transmit partial blocks and avoid defragmenting the buffer, we'll need to return the number of bytes consumed. This first modification makes the function do this and its callers take this into account. At the moment the function still works atomically so it returns either the block size or zero. However all call places have been adapted to consider any value between zero and the block size.	2019-05-28 14:48:59 +02:00

4678 Commits