haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-10 17:17:06 +02:00

Author	SHA1	Message	Date
William Lallemand	90afe90681	MINOR: ssl/cli: update pointer to store in 'commit ssl cert' The crtlist_entry structure uses a pointer to the store as key. That's a problem with the dynamic update of a certificate over the CLI, because it allocates a new ckch_store. So updating the pointers is needed. To achieve that, a linked list of the crtlist_entry was added in the ckch_store, so it's easy to iterate on this list to update the pointers. Another solution would have been to rework the system so we don't allocate a new ckch_store, but it requires a rework of the ckch code.	2020-03-31 12:32:17 +02:00
William Lallemand	fa8cf0c476	MINOR: ssl: store a ptr to crtlist in crtlist_entry Store a pointer to crtlist in crtlist_entry so we can re-insert a crtlist_entry in its crtlist ebpt after updating its key.	2020-03-31 12:32:17 +02:00
William Lallemand	23d61c00b9	MINOR: ssl: add a list of crtlist_entry in ckch_store When updating a ckch_store we may want to update its pointer in the crtlist_entry which use it. To do this, we need the list of the entries using the store.	2020-03-31 12:32:17 +02:00
William Lallemand	09bd5a0787	MINOR: ssl: use crtlist_free() upon error in directory loading Replace the manual cleanup which is done in crtlist_load_cert_dir() by a call to the crtlist_free() function.	2020-03-31 12:32:17 +02:00
William Lallemand	4c68bba5c1	REORG: ssl: move some functions above crtlist_load_cert_dir() Move some function above crtlist_load_cert_dir() so crtlist_load_cert_dir() is at the right place, and crtlist_free() can be used inside.	2020-03-31 12:32:17 +02:00
William Lallemand	493983128b	BUG/MINOR: ssl: ckch_inst wrongly inserted in crtlist_entry The instances were wrongly inserted in the crtlist entries, all instances of a crt-list were inserted in the last crt-list entry. Which was kind of handy to free all instances upon error. Now that it's done correctly, the error path was changed, it must iterate on the entries and find the ckch_insts which were generated for this bind_conf. To avoid wasting time, it stops the iteration once it found the first unsuccessful generation.	2020-03-31 12:32:17 +02:00
William Lallemand	ad3c37b760	REORG: ssl: move SETCERT enum to ssl_sock.h Move the SETCERT enum at the right place to cleanup ssl_sock.c.	2020-03-31 12:32:17 +02:00
William Lallemand	79d31ec0d4	MINOR: ssl: add a list of bind_conf in struct crtlist In order to be able to add new certificate in a crt-list, we need the list of bind_conf that uses this crt-list so we can create a ckch_inst for each of them.	2020-03-31 12:32:17 +02:00
Olivier Houchard	079cb9af22	MEDIUM: connections: Revamp the way idle connections are killed The original algorithm always killed half the idle connections. This doesn't take into account the way the load can change. Instead, we now kill half of the exceeding connections (exceeding connection being the number of used + idle connections past the last maximum used connections reached). That way if we reach a peak, we will kill much less, and it'll slowly go back down when there's less usage.	2020-03-30 00:30:07 +02:00
Olivier Houchard	cf612a0457	MINOR: servers: Add a counter for the number of currently used connections. Add a counter to know the current number of used connections, as well as the max, this will be used later to refine the algorithm used to kill idle connections, based on current usage.	2020-03-30 00:30:01 +02:00
Jerome Magnin	824186bb08	MEDIUM: stream: support use-server rules with dynamic names server-template introduced the possibility to scale the number of servers in a backend without needing a configuration change and associated reload. On the other hand it became impractical to write use-server rules for these servers as they would only accept existing server labels as argument. This patch allows the use of log-format notation to describe the target of a use-server rule, such as in the example below: listen test bind *:1234 use-server %[hdr(srv)] if { hdr(srv) -m found } use-server s1 if { path / } server s1 127.0.0.1:18080 server s2 127.0.0.1:18081 If a use-server rule is applied because it was conditioned by an ACL returning true, but the target of the use-server rule cannot be resolved, no other use-server rule is evaluated and we fall back to load balancing. This feature was requested on the ML, and bumped with issue #563.	2020-03-29 09:55:10 +02:00
Jerome Magnin	eb421b2fe0	MINOR: listener: add so_name sample fetch Add a sample fetch for the name of a bind. This can be useful to take decisions when PROXY protocol is used and we can't rely on dst, such as the sample config below. defaults mode http listen bar bind 127.0.0.1:1111 server s1 127.0.1.1:1234 send-proxy listen foo bind 127.0.1.1:1234 name foo accept-proxy http-request return status 200 hdr dst %[dst] if { dst 127.0.1.1 }	2020-03-29 05:47:29 +02:00
Emmanuel Hocdet	1673977892	MINOR: ssl: skip self issued CA in cert chain for ssl_ctx First: self issued CA, aka root CA, is the anchor for chain validation, no need to send it, client must have it. HAProxy can skip it in ssl_ctx. Second: the main motivation to skip root CA in ssl_ctx is to be able to provide it in the chain without drawback. Use case is to provide issuer for ocsp without the need for .issuer and be able to share it in issuers-chain-path. This concerns all certificates without intermediate certificates. It's useless for BoringSSL, .issuer is ignored because ocsp bits don't need it.	2020-03-26 12:53:53 +01:00
Baptiste Assmann	37950c8d27	BUG/MEDIUM: dns: improper parsing of additional records `13a9232ebc` introduced parsing of the Additional DNS response section to pick up IP address when available. That said, this introduced a side effect for other query types (A and AAAA) leading to consider those responses invalid when parsing the Additional section. This patch avoids this situation by ensuring the Additional section is parsed only for SRV queries.	2020-03-26 12:43:36 +01:00
Baptiste Assmann	17ab79f07d	CLEANUP: remove obsolete comments This patch removes some old comments introduced by `13a9232ebc`. Those comments are related to issues already fixed.	2020-03-26 12:43:36 +01:00
Olivier Houchard	c3500c3ccd	MINOR: build: Fix build in mux_h1 We want to check if the input buffer contains data, not the connection. This should unbreak the build.	2020-03-25 17:06:16 +01:00
Olivier Houchard	69664419d2	BUG/MEDIUM: mux_h1: Process a new request if we already received it. In h1_detach(), if our input buffer isn't empty, don't just subscribe(), we may hold a new request, and there's nothing left to read. Instead, call h1_process() directly, so that a new stream is created. Failure to do so means if we received the new request too early, the connection will just hang, as it happens when using svn.	2020-03-25 12:38:40 +01:00
Frédéric Lécaille	87eacbb12f	BUG/MINOR: peers: Use after free of "peers" section. When a "peers" section has no local peer, it is removed from the list of "peers" sections by check_config_validity(). But a stick-table which refers to a "peers" section stores a pointer to this peers section. These pointers must be reset to NULL for each stick-table referring to such a "peers" section to prevent stktable_init() from starting the peers frontend attached to the peers section by dereferencing the invalid pointer. Furthermore this patch stops the peers frontend as this is done for other configurations invalidated by check_config_validity(). Thank you to Olivier D for having reported this issue with such a simple configuration file which made haproxy crash when started with -c option for configuration file validation. defaults mode http peers mypeers peer toto 127.0.0.1:1024 backend test stick-table type ip size 10k expire 1h store http_req_rate(1h) peers mypeers Must be backported to 2.1 and 2.0.	2020-03-24 20:49:38 +01:00
William Lallemand	3ef2d56530	BUG/MINOR: peers: avoid an infinite loop with peers_fe is NULL Fix an infinite loop which was added in an attempt to fix #558. If the peers_fe is NULL, it will loop forever. Must be backported with `a2cfd7e` as far as 1.8.	2020-03-24 16:45:53 +01:00
William Lallemand	a2cfd7e356	BUG/MINOR: peers: init bind_proc to 1 if it wasn't initialized Tim reported that in master-worker mode, if a stick-table is declared but not used in the configuration, its associated peers listener won't bind. This problem is due to the fact that the master-worker and the daemon mode depend on the bind_proc field of the peers proxy to disable the listener. Unfortunately the bind_proc is left to 0 if no stick-table was used in the configuration, stopping the listener on all processes. This fix sets the bind_proc to the first process if it wasn't initialized. Should fix bug #558. Should be backported as far as 1.8.	2020-03-24 16:18:15 +01:00
Emmanuel Hocdet	4fed93eb72	MINOR: ssl: rework add cert chain to CTX to be libssl independent SSL_CTX_set1_chain is used for openssl >= 1.0.2 and a loop with SSL_CTX_add_extra_chain_cert for openssl < 1.0.2. SSL_CTX_add_extra_chain_cert exists with openssl >= 1.0.2 and can be used for all openssl versions (its new name is SSL_CTX_add0_chain_cert). This patch uses SSL_CTX_add_extra_chain_cert to remove any #ifdef for compatibility. In addition sk_X509_num()/sk_X509_value() replace sk_X509_shift() to extract CA from chain, as it is used in other parts of the code.	2020-03-24 14:46:01 +01:00
Emmanuel Hocdet	ef87e0a3da	CLEANUP: ssl: rename ssl_get_issuer_chain to ssl_get0_issuer_chain Rename ssl_get_issuer_chain to ssl_get0_issuer_chain to be consistent with openssl >= 1.0.2 API.	2020-03-23 15:35:39 +01:00
Emmanuel Hocdet	f4f14eacd3	BUG/MINOR: ssl: memory leak when find_chain is NULL This bug was introduced by `85888573` "BUG/MEDIUM: ssl: chain must be initialized with sk_X509_new_null()". No need to set find_chain with sk_X509_new_null(), use find_chain conditionally to fix issue #516. This bug was referenced by issue #559. [wla: fix some alignment/indentation issue]	2020-03-23 13:10:10 +01:00
Willy Tarreau	95abd5be9f	CLEANUP: haproxy/threads: don't check global_tasks_mask twice In run_thread_poll_loop() we test both for (global_tasks_mask & tid_bit) and thread_has_tasks(), but the former is useless since this test is already part of the latter.	2020-03-23 09:33:32 +01:00
Willy Tarreau	4f46a354e6	BUG/MINOR: haproxy/threads: close a possible race in soft-stop detection Commit `4b3f27b` ("BUG/MINOR: haproxy/threads: try to make all threads leave together") improved the soft-stop synchronization but it left a small race open because it looks at tasks_run_queue, which can drop to zero then back to one while another thread picks the task from the run queue to insert it into the tasklet_list. The risk is very low but not null. In addition the condition didn't consider the possible presence of signals in the queue. This patch moves the stopping detection just after the "wake" calculation which already takes care of the various queues' sizes and signals. It avoids needlessly duplicating these tests. The bug was discovered during a code review but will probably never be observed. This fix may be backported to 2.1 and 2.0 along with the commit above.	2020-03-23 09:27:28 +01:00
Olivier Houchard	199d4fade4	MINOR: muxes: Note that we can't use a connection when added to the srv idle. In the various muxes, add a comment documenting that once srv_add_to_idle_list() got called, any thread may pick that connection up, so it is unsafe to access the mux context/the connection, the only thing we can do is returning.	2020-03-22 23:25:51 +01:00
Olivier Houchard	3c49c1bd5c	BUG/MEDIUM: h1: Make sure we subscribe before going into idle list. In h1_detach(), make sure we subscribe before we call srv_add_to_idle_list(), not after. As soon as srv_add_to_idle_list() is called, and it is put in an idle list, another thread can take it, and we're no longer allowed to subscribe. This fixes a race condition when another thread grabs a connection as soon as it is put, the original owner would subscribe, and thus the new thread would fail to do so, and to activate polling.	2020-03-22 20:05:59 +01:00
William Lallemand	18eeb8e815	BUG/MINOR: ssl/cli: fix a potential NULL dereference Fix a potential NULL dereference in "show ssl cert" when we can't allocate the <out> trash buffer. This patch creates a new label so we could jump without trying to do the ci_putchk in this case. This bug was introduced by `ea987ed` ("MINOR: ssl/cli: 'new ssl cert' command"). 2.2 only. This bug was referenced by issue #556.	2020-03-20 14:49:25 +01:00
Olivier Houchard	c0caac2cc8	BUG/MINOR: connections: Make sure we free the connection on failure. In connect_server(), make sure we properly free a newly created connection if we somehow fail, and it has not yet been attached to a conn_stream, or it would lead to a memory leak. This should appease coverity for backend.c, as reported in issue #556. This should be backported to 2.1, 2.0 and 1.9	2020-03-20 14:35:07 +01:00
William Lallemand	67b991d370	BUG/MINOR: ssl/cli: free BIO upon error in 'show ssl cert' Fix a memory leak that could happen upon a "show ssl cert" if notBefore: or notAfter: failed to extract its ASN1 string. Introduced by `d4f946c` ("MINOR: ssl/cli: 'show ssl cert' give information on the certificates"). 2.2 only.	2020-03-20 14:22:35 +01:00
Olivier Houchard	e4ba0d4fc6	BUG/MEDIUM: build: Fix compilation by spelling decl correctly. Fix build on architectures for which double-width CAS isn't implemented by spelling __decl_rwlock correctly.	2020-03-20 11:03:38 +01:00
William Lallemand	3c516fc989	BUG/MINOR: ssl: crtlist_dup_filters() must return NULL with fcount == 0 crtlist_dup_filters() must return a NULL ptr if the fcount number is 0. This bug was introduced by `2954c47` ("MEDIUM: ssl: allow crt-list caching").	2020-03-20 10:10:25 +01:00
Tim Duesterhus	2445f8d4ec	BUG/MINOR: ssl: Correctly add the 1 for the sentinel to the number of elements In `crtlist_dup_filters()` add the `1` to the number of elements instead of the size of a single element. This bug was introduced in commit `2954c478eb`, which is 2.2+. No backport needed.	2020-03-20 09:43:53 +01:00
Tim Duesterhus	8c12025a7d	BUG/MINOR: ssl: Do not free garbage pointers on memory allocation failure In `ckch_inst_sni_ctx_to_sni_filters` use `calloc()` to allocate the filter array. When the function fails to allocate memory for a single entry the whole array will be `free()`d using free_sni_filters(). With the previous `malloc()` the pointers for entries after the failing allocation could possibly be a garbage value. This bug was introduced in commit `38df1c8006`, which is 2.2+. No backport needed.	2020-03-20 09:36:20 +01:00
Olivier Houchard	fdc7ee2173	BUG/MEDIUM: connections: Don't forget to decrement idle connection counters. In conn_backend_get(), when we manage to get an idle connection from the current thread's pool, don't forget to decrement the idle connection counters, or we may end up not reusing connections when we could, and/or killing connections when we shouldn't.	2020-03-19 23:56:08 +01:00
Olivier Houchard	b3397367dc	MEDIUM: connections: Kill connections even if we are reusing one. In connect_server(), if we notice we have more file descriptors opened than we should, there's no reason not to close a connection just because we're reusing one, so do it anyway.	2020-03-19 22:07:34 +01:00
Olivier Houchard	a41bb0b6c4	MEDIUM: mux_fcgi: Implement the takeover() method. Implement a takeover() method in the mux_fcgi, so that other threads may take an idle connection over if they need it.	2020-03-19 22:07:34 +01:00
Olivier Houchard	cd4159f039	MEDIUM: mux_h2: Implement the takeover() method. Implement a takeover() method in the mux_h2, so that other threads may take an idle connection over if they need it.	2020-03-19 22:07:34 +01:00
Olivier Houchard	f12ca9f8f1	MEDIUM: mux_h1: Implement the takeover() method. Implement a takeover() method in the mux_h1, so that other threads may take an idle connection over if they need it.	2020-03-19 22:07:34 +01:00
Olivier Houchard	566df309c6	MEDIUM: connections: Attempt to get idle connections from other threads. In connect_server(), if we no longer have any idle connections for the current thread, attempt to use the new "takeover" mux method to steal a connection from another thread. This should have no impact right now, given no mux implements it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	d2489e00b0	MINOR: connections: Add a flag to know if we're in the safe or idle list. Add flags to connections, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST, to let one know we are in the safe list, or the idle list.	2020-03-19 22:07:33 +01:00
Olivier Houchard	f0d4dff25c	MINOR: connections: Make the "list" element a struct mt_list instead of list. Make the "list" element a struct mt_list, and explicitly use list_from_mt_list to get a struct list * where it is used as such, so that mt_list_for_each_entry will be usable with it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	8851664293	MINOR: fd: Implement fd_takeover(). Implement a new function, fd_takeover(), that lets you become the thread responsible for the fd. On architectures that do not have a double-width CAS, use a global rwlock. fd_set_running() was also changed to be able to compete with fd_takeover(), either using a double-width CAS on both running_mask and thread_mask, or by claiming a reader on the global rwlock. This extra operation should not have any measurable impact on modern architectures where threading is relevant.	2020-03-19 22:07:33 +01:00
Olivier Houchard	dc2f2753e9	MEDIUM: servers: Split the connections into idle, safe, and available. Revamp the server connection lists. We now have 3 lists : - idle_conns, which contains idling connections - safe_conns, which contains idling connections that are safe to use even for the first request - available_conns, which contains connections that are not idling, but can still accept new streams (those are HTTP/2 or fastcgi, and are always considered safe).	2020-03-19 22:07:33 +01:00
Olivier Houchard	2444aa5b66	MEDIUM: sessions: Don't be responsible for connections anymore. Make it so sessions are not responsible for connection anymore, except for connections that are private, and thus can't be shared, otherwise, as soon as a request is done, the session will just add the connection to the orphan connections pool. This will break http-reuse safe, but it is expected to be fixed later.	2020-03-19 22:07:33 +01:00
William Lallemand	59c16fc2cb	MINOR: ssl/cli: show certificate status in 'show ssl cert' Display the status of the certificate in 'show ssl cert'. Example: Status: Empty Status: Unused Status: Used	2020-03-19 20:36:13 +01:00
William Lallemand	ea987ed78a	MINOR: ssl/cli: 'new ssl cert' command The CLI command "new ssl cert" allows one to create a new certificate store in memory. It can be filled with "set ssl cert" and "commit ssl cert". This patch also made a small change in "show ssl cert" to handle an empty certificate store. Multi-certificate bundles are not supported since they will probably be removed soon. This feature alone is useless since there is no way to associate the store to a crt-list yet. Example: $ echo "new ssl cert foobar.pem" | socat /tmp/sock1 - New empty certificate store 'foobar.pem'! $ printf "set ssl cert foobar.pem <<\n$(cat localhost.pem.rsa)\n\n" | socat /tmp/sock1 - Transaction created for certificate foobar.pem! $ echo "commit ssl cert foobar.pem" | socat /tmp/sock1 - Committing foobar.pem Success! $ echo "show ssl cert foobar.pem" | socat /tmp/sock1 - Filename: foobar.pem [...]	2020-03-19 17:44:41 +01:00
Olivier Houchard	899fb8abdc	MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc. The flush_lock was introduced, mostly to be sure that pool_gc() will never dereference a pointer that has been free'd. __pool_get_first() was acquiring the lock too; the fear was that otherwise that pointer could get free'd later, and then pool_gc() would attempt to dereference it. However, that can not happen, because the only functions that can free a pointer, when using lockless pools, are pool_gc() and pool_flush(), and as long as those two are mutually exclusive, nobody will be able to free the pointer while pool_gc() attempts to access it. So change the flush_lock to a spinlock, and don't bother acquiring/releasing it in __pool_get_first(), that way callers of __pool_get_first() won't have to wait while the pool is flushed. The worst that can happen is we call __pool_refill_alloc() while the pool is getting flushed, and memory can get allocated just to be free'd. This may help with github issue #552. This may be backported to 2.1, 2.0 and 1.9.	2020-03-18 15:55:35 +01:00
Olivier Houchard	b0198cc413	BUG/MEDIUM: wdt: Don't ignore WDTSIG and DEBUGSIG in __signal_process_queue(). When running __signal_process_queue(), we ignore most signals. We can't, however, ignore WDTSIG and DEBUGSIG, otherwise that thread may end up waiting for another one that could hold a glibc lock, while the other thread wait for this one to enter debug_handler(). So make sure WDTSIG and DEBUGSIG aren't ignored, if they are defined. This probably explains the watchdog deadlock described in github issue This should be backported to 2.1, 2.0 and 1.9.	2020-03-18 13:10:05 +01:00
Olivier Houchard	de01ea9878	MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h. Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into types/signal.h, so that we can access them in another file. We need those definition to avoid blocking those signals when running __signal_process_queue(). This should be backported to 2.1, 2.0 and 1.9.	2020-03-18 13:07:19 +01:00
Tim Duesterhus	b584b4475b	BUG/MINOR: pattern: Do not pass len = 0 to calloc() The behavior of calloc() when being passed `0` as `nelem` is implementation defined. It may return a NULL pointer. Avoid this issue by checking before allocating. While doing so adjust the local integer variables that are used to refer to memory offsets to `size_t`. This issue was introduced in commit `f91ac19299`. This patch should be backported together with that commit.	2020-03-18 05:17:28 +01:00
William Lallemand	a64593c80d	BUG/MINOR: ssl: memleak of struct crtlist_entry There is a memleak of the entry structure in crtlist_load_cert_dir(), in the case we can't stat the file, or this is not a regular file. Let's move the entry allocation so it's done after these tests. Fix issue #551.	2020-03-17 20:28:06 +01:00
Olivier Houchard	c62d9ab7cb	MINOR: tasks: Provide the tasklet to the callback. When tasklets were introduced, it was decided not to provide the tasklet to the callback, but NULL instead. While it may have been reasonable back then, maybe to be able to differentiate a task from a tasklet from the callback, it also means that we can't access the tasklet from the handler if the context provided can't be trusted. As no handler is shared between a task and a tasklet, and there are now other means of distinguishing between task and tasklet, just pass the tasklet pointer too. This may be backported to 2.1, 2.0 and 1.9 if needed.	2020-03-17 18:52:33 +01:00
William Lallemand	909086ea61	BUG/MINOR: ssl: memory leak in crtlist_parse_file() A memory leak happens in an error case when ckchs_load_cert_file() returns NULL in crtlist_parse_file(). This bug was introduced by commit `2954c47` ("MEDIUM: ssl: allow crt-list caching") This patch fixes bug #551.	2020-03-17 16:57:34 +01:00
Olivier Houchard	a7bf573520	MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock. In the struct fdtab, introduce a new mask, running_mask. Each thread should add its bit before using the fd. Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just spin as long as the mask is non-zero, to be sure we access the data exclusively. fd_set_running_excl() spins until the mask is 0, fd_set_running() just adds the thread bit, and fd_clr_running() removes it.	2020-03-17 15:30:07 +01:00
William Lallemand	2ea1b49832	BUG/MINOR: ssl/cli: free the trash chunk in dump_crtlist Free the trash chunk after dumping the crt-lists. Introduced by `a6ffd5b` ("MINOR: ssl/cli: show/dump ssl crt-list").	2020-03-17 15:30:05 +01:00
William Lallemand	a6ffd5bf8a	MINOR: ssl/cli: show/dump ssl crt-list Implement 2 new commands on the CLI: show ssl crt-list [<filename>]: Without a specified filename, display the list of crt-lists used by the configuration. If a filename is specified, it will display the content of this crt-list, with a line identifier at the beginning of each line. This output must not be used as a crt-list file. dump ssl crt-list <filename>: Dump the content of a crt-list, the output can be used as a crt-list file. Note: It currently displays the default ssl-min-ver and ssl-max-ver which are potentially not in the original file.	2020-03-17 14:59:37 +01:00
Olivier Houchard	a48e7ece48	MINOR: mux_pt: Don't try to remove the connection from the idle list. Don't bother trying to remove the connection from the idle list, as the only connections the mux_pt handles are now the TCP-mode connections, and those are never added to the idle list.	2020-03-17 13:38:18 +01:00
Olivier Houchard	7fa5562190	MINOR: fd: Use a separate lock for logs instead of abusing the fd lock. Introduce a new spinlock, log_lock, and use it instead of abusing the FD lock.	2020-03-17 13:38:09 +01:00
Kevin Zhu	079f808741	BUG/MEDIUM: spoe: dup agent's engine_id string from trash.area The agent's engine_id was not duplicated from the trash buffer, so all engine_ids pointed to the same address "&trash.area". As a result the engine_id changed at run time, and releasing both the agents and the trash caused a double free. This bug was introduced by the commit `ee3bcddef` ("MINOR: tools: add a generic function to generate UUIDs"). No backport is needed, this is 2.2-dev.	2020-03-16 17:35:30 +01:00
William Lallemand	83918e2ef1	BUG/MINOR: ssl: can't open directories anymore The commit `6be66ec` ("MINOR: ssl: directories are loaded like crt-list") broke the directory loading of the certificates. The <crtlist> wasn't filled by the crtlist_load_cert_dir() function. And the entries were not correctly initialized. Leading to a segfault during startup.	2020-03-16 17:29:10 +01:00
William Lallemand	6be66ec7a9	MINOR: ssl: directories are loaded like crt-list Generate a directory cache with the crtlist and crtlist_entry structures. With this new model, directories are a special case of the crt-lists. A directory is a crt-list which allows only one occurrence of each file, without SSL configuration (ssl_bind_conf) and without filters.	2020-03-16 16:23:44 +01:00
William Lallemand	2954c478eb	MEDIUM: ssl: allow crt-list caching The crtlist structure defines a crt-list in the HAProxy configuration. It contains crtlist_entry structures which are the lines in a crt-list file. crt-list are now loaded in memory using crtlist and crtlist_entry structures. The file is read only once. The generation algorithm changed a little bit, new ckch instances are generated from the crtlist structures, instead of being generated during the file loading. The loading function was split in two, one that loads and caches the crt-list and certificates, and one that looks for a crt-list and creates the ckch instances. Filters are also stored in crtlist_entry->filters as a char ** so we can generate the sni_ctx again if needed. It won't be needed anymore to parse the sni_ctx to do that. A crtlist_entry stores the list of all ckch_inst that were generated from this entry.	2020-03-16 16:18:49 +01:00
William Lallemand	24bde43eab	MINOR: ssl: pass ckch_inst to ssl_sock_load_ckchs() Pass a pointer to the struct ckch_inst to the ssl_sock_load_ckchs() function so we can manipulate the ckch_inst from ssl_sock_load_cert_list_file() and ssl_sock_load_cert().	2020-03-16 16:18:49 +01:00
William Lallemand	06b22a8fba	REORG: ssl: move ssl_sock_load_cert() Move the ssl_sock_load_cert() at the right place.	2020-03-16 16:18:49 +01:00
Emeric Brun	70de43b77b	BUG/MEDIUM: peers: resync ended with RESYNC_PARTIAL in wrong cases. This bug was introduced with peers.c code re-work (`7d0ceeec80`): "struct peer" flags are mistakenly checked instead of "struct peers" flags to check the resync status of the local peer. The issue was reported here: https://github.com/haproxy/haproxy/issues/545 This bug affects all branches >= 2.0 and should be backported.	2020-03-16 11:32:47 +01:00
Tim Duesterhus	2b7f6c22d8	CLEANUP: connection: Stop directly setting an ist's .ptr Instead replace the complete `ist` by the value returned from `ist2`. This was noticed during review of issue #549.	2020-03-14 18:31:58 +01:00
Willy Tarreau	2e8ab6b560	MINOR: use DISGUISE() everywhere we deliberately want to ignore a result It's more generic and versatile than the previous shut_your_big_mouth_gcc() that was used to silence annoying warnings as it's not limited to ignoring syscalls returns only. This allows us to get rid of the aforementioned function and the shut_your_big_mouth_gcc_int variable, that started to look ugly in multi-threaded environments.	2020-03-14 11:04:49 +01:00
Balvinder Singh Rawat	def595e2df	DOC: correct typo in alert message about rspirep This message comes when we run: haproxy -c -V -f /etc/haproxy/haproxy.cfg [ALERT] 072/233727 (30865) : parsing [/etc/haproxy/haproxy.cfg:34] : The 'rspirep' directive is not supported anymore sionce HAProxy 2.1. Use 'http-response replace-header' instead. [ALERT] 072/233727 (30865) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg [ALERT] 072/233727 (30865) : Fatal errors found in configuration.	2020-03-14 10:14:41 +01:00
Ilya Shipitsin	77e3b4a2c4	CLEANUP: assorted typo fixes in the code and comments These are mostly comments in the code. A few error messages were fixed and are of low enough importance not to deserve a backport. Some regtests were also fixed.	2020-03-14 09:42:07 +01:00
Tim Duesterhus	a8692f3fe0	CLEANUP: connection: Add blank line after declarations in PP handling This adds the missing blank lines in `make_proxy_line_v2` and `conn_recv_proxy`. It also adjusts the type of the temporary variable used for the return value of `recv` to be `ssize_t` instead of `int`.	2020-03-13 17:26:43 +01:00
Tim Duesterhus	cf6e0c8a83	MEDIUM: proxy_protocol: Support sending unique IDs using PPv2 This patch adds the `unique-id` option to `proxy-v2-options`. If this option is set a unique ID will be generated based on the `unique-id-format` while sending the proxy protocol v2 header and stored as the unique id for the first stream of the connection. This feature is meant to be used in `tcp` mode. It works on HTTP mode, but might result in inconsistent unique IDs for the first request on a keep-alive connection, because the unique ID for the first stream is generated earlier than the others. Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made available in TCP mode.	2020-03-13 17:26:43 +01:00
Tim Duesterhus	d1b15b6e9b	MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections This patch reads a proxy protocol v2 provided unique ID and makes it available using the `fc_pp_unique_id` fetch.	2020-03-13 17:25:23 +01:00
Willy Tarreau	4b3f27b67f	BUG/MINOR: haproxy/threads: try to make all threads leave together There's a small issue with soft stop combined with the incoming connection load balancing. A thread may dispatch a connection to another one at the moment stopping=1 is set, and the second one could stop by seeing (jobs - unstoppable_jobs) == 0 in run_poll_loop(), without ever picking these connections from the queue. This is visible in that it may occasionally cause a connection drop on reload since no remaining thread will ever pick that connection anymore. In order to address this, this patch adds a stopping_thread_mask variable by which threads acknowledge their willingness to stop when their runqueue is empty. And all threads will only stop at this moment, so that if finally some late work arrives in the thread's queue, it still has a chance to process it. This should be backported to 2.1 and 2.0.	2020-03-12 19:17:19 +01:00
Willy Tarreau	a7da5e8dd0	BUG/MINOR: listener/mq: do not dispatch connections to remote threads when stopping When stopping there is a risk that other threads are already in the process of stopping, so let's not add new work in their queue and instead keep the incoming connection local. This should be backported to 2.1 and 2.0.	2020-03-12 19:10:29 +01:00
Willy Tarreau	f8ea00e05e	BUG/MINOR: haproxy: always initialize sleeping_thread_mask Surprisingly the variable was never initialized, though on most platforms it's zeroed at boot, and it is relatively harmless anyway since in the worst case the bits are updated around poll(). This was introduced by commit `79321b95a8` and needs to be backported as far as 1.9.	2020-03-12 19:09:46 +01:00
Olivier Houchard	51d9339d04	BUG/MEDIUM: pools: Always update free_list in pool_gc(). In pool_gc(), when we're not using lockless pool, always update free_list, and read from it the next element to free. As we now unlock the pool while we're freeing the item, another thread could have updated free_list behind our back. Not doing so could lead to segfaults when pool_gc() is called. This should be backported to 2.1.	2020-03-12 19:07:10 +01:00
Olivier Houchard	bdb00c5db9	BUG/MEDIUM: connections: Don't assume the connection has a valid session. Don't assume the connection always has a valid session in "owner". Instead, attempt to retrieve the session from the stream, and modify the error snapshot code to not assume we always have a session, or the proxy for the other end.	2020-03-12 15:39:37 +01:00
Willy Tarreau	1544c14c57	BUG/MEDIUM: random: align the state on 2*64 bits for ARM64 x86_64 and ARM64 do support the double-word atomic CAS. However on ARM it must be done only on aligned data. The random generator makes use of such double-word atomic CAS when available but didn't enforce alignment, which causes ARM64 to crash early in the startup since commit `52bf839` ("BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG"). This commit just unconditionally aligns the arrays. It must be backported to all branches where the commit above is backported (likely till 2.0).	2020-03-12 00:34:22 +01:00
Olivier Houchard	8676514d4e	MINOR: servers: Kill priv_conns. Remove the list of private connections from server, it has been largely unused, we only inserted connections in it, but we would never actually use it.	2020-03-11 19:20:01 +01:00
Willy Tarreau	304e17eb88	MEDIUM: init: always try to push the FD limit when maxconn is set from -m When a maximum memory setting is passed to haproxy and maxconn is not set and ulimit-n is not set, it is expected that maxconn will be set to the highest value permitted by this memory setting, possibly affecting the FD limit. When maxconn was changed to be deduced from the current process's FD limit, the automatic setting above was partially lost because it now remains limited to the current FD limit in addition to being limited to the memory usage. For unprivileged processes it does not change anything, but for privileged processes the difference is important. Indeed, the previous behavior ensured that the new FD limit could be enforced on the process as long as the user had the privilege to do so. Now this does not happen anymore, and some people rely on this for automatic sizing in VM environments. This patch implements the ability to verify if the setting will be enforceable on the process or not. First it computes maxconn based on the memory limits alone, then checks if the process is willing to accept them, otherwise tries again by respecting the process' hard limit. Thanks to this we now have the best of the pre-2.0 behavior and the current one, in that privileged users will be able to get as high a maxconn as they need just based on the memory limit, while unprivileged users will still get as high a setting as permitted by the intersection of the memory limit and the process' FD limit. Ideally, after some observation period, this patch along with the previous one "MINOR: init: move the maxsock calculation code to compute_ideal_maxsock()" should be backported to 2.1 and 2.0. Thanks to Baptiste for raising the issue.	2020-03-10 18:08:11 +01:00
Willy Tarreau	a409f30d09	MINOR: init: move the maxsock calculation code to compute_ideal_maxsock() The maxsock value is currently derived from global.maxconn and a few other settings, some of which also depend on global.maxconn. This makes it difficult to check if a limit is already too high or not during the maxconn automatic sizing. Let's move this code into a new function, compute_ideal_maxsock() which now takes a maxconn in argument. It performs the same operations and returns the maxsock value if global.maxconn were to be set to that value. It now replaces the previous code to compute maxsock.	2020-03-10 18:08:11 +01:00
Olivier Houchard	6c96fc166c	BUG/MINOR: buffers: MT_LIST_DEL_SAFE() expects the temporary pointer. When calling MT_LIST_DEL_SAFE(), give it the temporary pointer "tmpelt", as that's what is expected. We want to be able to set that pointer to NULL, to let other parts of the code know we deleted an element.	2020-03-10 17:44:40 +01:00
William Lallemand	2d232c2131	CLEANUP: ssl: separate the directory loading in a new function In order to store and cache the directory loading, the directory loading was separated from ssl_sock_load_cert() and put in a new function ssl_sock_load_cert_dir() to be more readable. This patch only splits the function in two.	2020-03-10 15:55:22 +01:00
Willy Tarreau	0627815f70	BUILD: wdt: only test for SI_TKILL when compiled with thread support SI_TKILL is not necessarily defined on older systems and is used only with the pthread_kill() call a few lines below, so it should also be subject to the USE_THREAD condition.	2020-03-10 09:26:17 +01:00
Willy Tarreau	62af9c83f9	BUILD: make dladdr1 depend on glibc version and not __USE_GNU Technically speaking the call was implemented in glibc 2.3 so we must rely on this and not on __USE_GNU which is an internal define of glibc to track use of GNU_SOURCE.	2020-03-10 07:53:10 +01:00
Willy Tarreau	06c63aec95	CLEANUP: remove support for USE_MY_SPLICE The splice() syscall has been supported in glibc since version 2.5 issued in 2006 and is present on supported systems so there's no need for having our own arch-specific syscall definitions anymore.	2020-03-10 07:23:41 +01:00
Willy Tarreau	3858b122a6	CLEANUP: remove support for USE_MY_EPOLL This was made to support epoll on patched 2.4 kernels, and on early 2.6 using alternative libcs thanks to the arch-specific syscall definitions. All the features we support have been around since 2.6.2 and present in glibc since 2.3.2, neither of which are found in field anymore. Let's simply drop this and use epoll normally.	2020-03-10 07:08:10 +01:00
Willy Tarreau	618ac6ea52	CLEANUP: drop support for USE_MY_ACCEPT4 The accept4() syscall has been present for a while now, there is no more reason for maintaining our own arch-specific syscall implementation for systems lacking it in libc but having it in the kernel.	2020-03-10 07:02:46 +01:00
Willy Tarreau	c3e926bf3b	CLEANUP: remove support for Linux i686 vsyscalls This was introduced 10 years ago to squeeze a few CPU cycles per syscall on 32-bit x86 machines and was already quite old by then, requiring to explicitly enable support for this in the kernel. We don't even know if it still builds, let alone if it works at all on recent kernels! Let's completely drop this now.	2020-03-10 06:55:52 +01:00
William Lallemand	6763016866	BUG/MINOR: ssl/cli: sni_ctx mustn't always be used as filters Since commit 244b070 ("MINOR: ssl/cli: support crt-list filters"), HAProxy generates a list of filters based on the sni_ctx in memory. However it's not always relevant, sometimes no filters were configured and the CN/SAN in the new certificate are not the same. This patch fixes the issue by using a flag filters in the ckch_inst, so we are able to know if there were filters or not. In the latter case it uses the CN/SAN of the new certificate to generate the sni_ctx. note: filters are still only used in the crt-list atm.	2020-03-09 17:32:04 +01:00
Willy Tarreau	ee3bcddef7	MINOR: tools: add a generic function to generate UUIDs We currently have two UUID generation functions, one for the sample fetch and the other one in the SPOE filter. Both were a bit complicated since they were made to support random() implementations returning an arbitrary number of bits, and were throwing away 33 bits every 64. Now we don't need this anymore, so let's have a generic function consuming 64 bits at once and use it as appropriate.	2020-03-08 18:04:16 +01:00
Willy Tarreau	aa8bbc12dd	MINOR: sample: make all bits random on the rand() sample fetch The rand() sample fetch supports being limited to a certain range, but it only uses 31 bits and scales them as requested, which means that when the requested output range is larger than 31 bits, the least significant one is not random and may even be constant. Let's make use of the whole 32 bits now that we have access to them.	2020-03-08 18:04:16 +01:00
Willy Tarreau	5a6d3e797e	BUG/MINOR: checks/threads: use ha_random() and not rand() In order to honor spread_checks we currently call rand() which is not thread safe and which must never turn its internal state to zero. This is not thread safe, let's use ha_random() instead. This is a complement to commit `52bf839394` ("BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG") and may be backported with it.	2020-03-08 17:56:47 +01:00
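One common way to scale a full 32-bit random value into a requested range is a 64-bit multiply followed by a 32-bit shift. The sketch below illustrates that technique only; `scale_rand32()` is a hypothetical name, and the commit above does not state that this exact expression is what the sample fetch uses.

```c
#include <stdint.h>

/* Map a uniformly distributed 32-bit value onto [0, range) using all
 * 32 input bits: the product is at most (2^32 - 1) * range, so its
 * upper 32 bits always stay below range. */
static uint32_t scale_rand32(uint32_t rnd, uint32_t range)
{
	return (uint32_t)(((uint64_t)rnd * range) >> 32);
}
```

For a range of 10, an input of `0x80000000` maps to 5 and the maximum input `0xFFFFFFFF` maps to 9, so every input bit contributes to the result.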
Willy Tarreau	b9f54c5592	MINOR: backend: use a single call to ha_random32() for the random LB algo For the random LB algorithm we need a random 32-bit hashing key that used to be made of two calls to random(). Now we can simply perform a single call to ha_random32() and get rid of the useless operations.	2020-03-08 17:31:39 +01:00
Willy Tarreau	52bf839394	BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG This is the replacement of the failed attempt to add thread safety and per-process sequences of random numbers initially tried with commit `1c306aa84d` ("BUG/MEDIUM: random: implement per-thread and per-process random sequences"). This new version takes a completely different approach and doesn't try to work around the horrible OS-specific and non-portable random API anymore. Instead it implements "xoroshiro128**", a reputedly high quality random number generator, which is one of the many variants of xorshift, which passes all quality tests and which is described here: http://prng.di.unimi.it/ While not cryptographically secure, it is fast and features a 2^128-1 period. It supports fast jumps allowing to cut the period into smaller non-overlapping sequences, which we use here to support up to 2^32 processes each having their own, non-overlapping sequence of 2^96 numbers (~7*10^28). This is enough to provide 1 billion randoms per second and per process for 2200 billion years. The implementation was made thread-safe either by using a double 64-bit CAS on platforms supporting it (x86_64, aarch64) or by using a local lock for the time needed to perform the shift operations. This ensures that all threads pick numbers from the same pool so that it is not needed to assign per-thread ranges. For processes we use the fast jump method to advance the sequence by 2^96 for each process.
Before this patch, the following config: global nbproc 8 frontend f bind :4445 mode http log stdout format raw daemon log-format "%[uuid] %pid" redirect location / Would produce this output: a4d0ad64-2645-4b74-b894-48acce0669af 12987 a4d0ad64-2645-4b74-b894-48acce0669af 12992 a4d0ad64-2645-4b74-b894-48acce0669af 12986 a4d0ad64-2645-4b74-b894-48acce0669af 12988 a4d0ad64-2645-4b74-b894-48acce0669af 12991 a4d0ad64-2645-4b74-b894-48acce0669af 12989 a4d0ad64-2645-4b74-b894-48acce0669af 12990 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986 (...) And now produces: f94b29b3-da74-4e03-a0c5-a532c635bad9 13011 47470c02-4862-4c33-80e7-a952899570e5 13014 86332123-539a-47bf-853f-8c8ea8b2a2b5 13013 8f9efa99-3143-47b2-83cf-d618c8dea711 13012 3cc0f5c7-d790-496b-8d39-bec77647af5b 13015 3ec64915-8f95-4374-9e66-e777dc8791e0 13009 0f9bf894-dcde-408c-b094-6e0bb3255452 13011 49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014 e23f6f2e-35c5-4433-a294-b790ab902653 13012 There are multiple benefits to using this method. First, it doesn't depend anymore on a non-portable API. Second it's thread safe. Third it is fast and more proven than any hack we could attempt to try to work around the deficiencies of the various implementations around. This commit depends on previous patches "MINOR: tools: add 64-bit rotate operators" and "BUG/MEDIUM: random: initialize the random pool a bit better", all of which will need to be backported at least as far as version 2.0. It doesn't require to backport the build fixes for circular include files dependency anymore.	2020-03-08 10:09:02 +01:00
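For readers unfamiliar with the generator named in the commit above, the following is a minimal single-threaded sketch of the publicly documented xoroshiro128** step function. The names `struct xoro_state`, `rotl64()` and `xoro_next()` are hypothetical, and the thread safety (double-word CAS or lock) and the per-process jump described in the commit are deliberately left out.

```c
#include <stdint.h>

/* 128-bit generator state; it must never be all zeroes. */
struct xoro_state {
	uint64_t s[2];
};

/* Rotate a 64-bit value left by k bits (0 < k < 64). */
static inline uint64_t rotl64(uint64_t x, int k)
{
	return (x << k) | (x >> (64 - k));
}

/* One xoroshiro128** step: derive the output from s[0], then advance
 * the state. NOT thread-safe as written. */
static uint64_t xoro_next(struct xoro_state *st)
{
	const uint64_t s0 = st->s[0];
	uint64_t s1 = st->s[1];
	const uint64_t result = rotl64(s0 * 5, 7) * 9;

	s1 ^= s0;
	st->s[0] = rotl64(s0, 24) ^ s1 ^ (s1 << 16);
	st->s[1] = rotl64(s1, 37);
	return result;
}
```

Because both state words are read and written in a single step, concurrent callers need the state update to be atomic, which is why the commit relies on a double 64-bit CAS or a local lock.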
Willy Tarreau	0fbf28a05b	Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences" This reverts commit `1c306aa84d`. It breaks the build on all non-glibc platforms. I got confused by the man page (which possibly is the most confusing man page I've ever read about a standard libc function) and mistakenly understood that random_r was portable, especially since it appears in latest freebsd source as well but not in released versions, and with a slightly different API :-/ We need to find a different solution with a fallback. Among the possibilities, we may reintroduce this one with a fallback relying on locking around the standard functions, keeping fingers crossed for no other library function to call them in parallel, or we may also provide our own PRNG, which is not necessarily more difficult than working around the totally broken up design of the portable API.	2020-03-07 11:24:39 +01:00
Willy Tarreau	1c306aa84d	BUG/MEDIUM: random: implement per-thread and per-process random sequences As mentioned in previous patch, the random number generator was never made thread-safe, which used not to be a problem for health checks spreading, until the uuid sample fetch function appeared. Currently it is possible for two threads or processes to produce exactly the same UUID. In fact it's extremely likely that this will happen for processes, as can be seen with this config: global nbproc 8 frontend f bind :4445 mode http log stdout daemon format raw log-format "%[uuid] %pid" redirect location / It typically produces this log: 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646 07764439-c24d-4e6f-a5a6-0138be59e7a8 30645 07764439-c24d-4e6f-a5a6-0138be59e7a8 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643 07764439-c24d-4e6f-a5a6-0138be59e7a8 30646 b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642 07764439-c24d-4e6f-a5a6-0138be59e7a8 30642 What this patch does is to use a distinct per-thread and per-process seed to make sure the same sequences will not appear, and will then extend these seeds by "burning" a number of randoms that depends on the global random seed, the thread ID and the process ID. This adds roughly 20 extra bits of randomness, resulting in 52 bits total per thread and per process. It only takes a few milliseconds to burn these randoms and given that threads start with a different seed, we know they will not catch each other. So these random extra bits are essentially added to ensure randomness between boots and cluster instances. This replaces all uses of random() with ha_random() which uses the thread-local state. This must be backported as far as 2.0 or any version having the UUID sample-fetch function since it's the main victim here. 
It's important to note that this patch, in addition to depending on the previous one "BUG/MEDIUM: init: initialize the random pool a bit better", also depends on the preceding build fixes to address a circular dependency issue in the include files that prevented it from building. Part or all of these patches may need to be backported or adapted as well.	2020-03-07 06:11:15 +01:00
Willy Tarreau	6c3a681bd6	BUG/MEDIUM: random: initialize the random pool a bit better Since the UUID sample fetch was created, some people noticed that in certain virtualized environments they manage to get exact same UUIDs on different instances started exactly at the same moment. It turns out that the randoms were only initialized to spread the health checks originally, not to provide "clean" randoms. This patch changes this and collects more randomness from various sources, including existing randoms, /dev/urandom when available, RAND_bytes() when OpenSSL is available, as well as the timing for such operations, then applies a SHA1 on all this to keep a 160 bits random seed available, 32 of which are passed to srandom(). It's worth mentioning that there's no clean way to pass more than 32 bits to srandom() as even initstate() provides an opaque state that must absolutely not be tampered with since known implementations contain state information. At least this allows to have up to 4 billion different sequences from the boot, which is not that bad. Note that the thread safety was still not addressed, which is another issue for another patch. This must be backported to all versions containing the UUID sample fetch function, i.e. as far as 2.0.	2020-03-07 06:11:11 +01:00
Christopher Faulet	49c2a707ce	BUG/MINOR: http-rules: Abort transaction when a redirect is applied on response In the same way as for the request, when a redirect rule is applied the transaction is aborted. This must be done returning HTTP_RULE_RES_ABRT from http_res_get_intercept_rule() function. No backport needed because on previous versions, the action return values are not handled the same way.	2020-03-06 15:44:38 +01:00
Christopher Faulet	ddc005ae57	BUG/MINOR: rules: Increment be_counters if backend is assigned for a silent-drop Backend counters must be incremented only if a backend was already assigned to the stream (when the stream exists). Otherwise, it means we are still on the frontend side. This patch may be backported as far as 1.6.	2020-03-06 15:36:04 +01:00
Christopher Faulet	f573ba2033	BUG/MINOR: rules: Return ACT_RET_ABRT when a silent-drop action is executed When an action interrupts a transaction, returning a response or not, it must return the ACT_RET_ABRT value and not ACT_RET_STOP. ACT_RET_STOP is reserved to stop the processing of the current ruleset. No backport needed because on previous versions, the action return values are not handled the same way.	2020-03-06 15:36:04 +01:00
Christopher Faulet	177f480f2c	BUG/MINOR: rules: Preserve FLT_END analyzers on silent-drop action When at least a filter is attached to a stream, FLT_END analyzers must be preserved on request and response channels. This patch should be backported as far as 1.7.	2020-03-06 15:36:04 +01:00
Christopher Faulet	5e896510a8	MINOR: compression/filters: Initialize the comp filter when stream is created Since the HTX mode is the only mode to process HTTP messages, the stream is created for a unique transaction. The keep-alive is handled at the mux level. So, the compression filter can be initialized when the stream is created and released with the stream. Concretely, .channel_start_analyze and .channel_end_analyze callback functions are replaced by .attach and .detach ones. With this change, it is no longer necessary to call FLT_START_FE/BE and FLT_END analysers for the compression filter.	2020-03-06 15:36:04 +01:00
Christopher Faulet	65554e1b95	MINOR: cache/filters: Initialize the cache filter when stream is created Since the HTX mode is the only mode to process HTTP messages, the stream is created for a unique transaction. The keep-alive is handled at the mux level. So, the cache filter can be initialized when the stream is created and released with the stream. Concretely, .channel_start_analyze and .channel_end_analyze callback functions are replaced by .attach and .detach ones. With this change, it is no longer necessary to call FLT_START_FE/BE and FLT_END analysers for the cache filter.	2020-03-06 15:36:04 +01:00
Christopher Faulet	d4a824e533	BUG/MINOR: http-rules: Fix a typo in the reject action function A typo was introduced by the commit `c5bb5a0f2` ("BUG/MINOR: http-rules: Preserve FLT_END analyzers on reject action"). This patch must be backported with the commit `c5bb5a0f2`.	2020-03-06 15:36:04 +01:00
Christopher Faulet	c5bb5a0f2b	BUG/MINOR: http-rules: Preserve FLT_END analyzers on reject action When at least a filter is attached to a stream, FLT_END analyzers must be preserved on request and response channels. This patch should be backported as far as 1.8.	2020-03-06 14:13:00 +01:00
Christopher Faulet	90d22a88cb	BUG/MINOR: http-rules: Return ACT_RET_ABRT to abort a transaction When an action interrupts a transaction, returning a response or not, it must return the ACT_RET_ABRT value and not ACT_RET_DONE. ACT_RET_DONE is reserved to stop the processing on the current channel but some analysers may still be active. When ACT_RET_ABRT is returned, all analysers are removed, except FLT_END if it is set. No backport needed because on previous versions, the action return value was not handled the same way. It is stated in the comment the return action returns ACT_RET_ABRT on success. It is the right code to use to abort a transaction. ACT_RET_DONE must be used when the message processing must be stopped. This does not mean the transaction is interrupted. No backport needed.	2020-03-06 14:13:00 +01:00
Christopher Faulet	bc275a9e44	BUG/MINOR: lua: Init the lua wake_time value before calling a lua function The wake_time of a lua context is now always set to TICK_ETERNITY when the context is initialized and every time the execution of the lua stack is started. It is mandatory to not set arbitrary wake_time when an action yields. No backport needed.	2020-03-06 14:13:00 +01:00
Christopher Faulet	501465d94b	MINOR: lua: Rename hlua_action_wake_time() to hlua_set_wake_time() This function does not depend on the action class. So use a more generic name. It will be easier to bind it on another class if necessary.	2020-03-06 14:13:00 +01:00
Christopher Faulet	d8f0e073dd	MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY This flag was used in some internal functions to be sure the current stream is able to handle HTTP content. It was introduced when the legacy HTTP code was still there. Now, It is possible to rely on stream's flags to be sure we have an HTX stream. So the flag HLUA_TXN_HTTP_RDY can be removed. Everywhere it was tested, it is replaced by a call to the IS_HTX_STRM() macro. This patch is mandatory to allow the support of the filters written in lua.	2020-03-06 14:13:00 +01:00
Christopher Faulet	d31c7b322c	MINOR: lua: Stop using lua txn in hlua_http_del_hdr() and hlua_http_add_hdr() In these functions, the lua txn was not used. So it can be removed from the function argument list. This patch is mandatory to allow the support of the filters written in lua.	2020-03-06 14:13:00 +01:00
Christopher Faulet	d1914aaa03	MINOR: lua: Stop using the lua txn in hlua_http_rep_hdr() In this function, the lua txn was only used to retrieve the stream. But it can be retrieved from the HTTP message, using its channel pointer. So, the lua txn can be removed from the function argument list. This patch is mandatory to allow the support of the filters written in lua.	2020-03-06 14:13:00 +01:00
Christopher Faulet	9d1332bbf4	MINOR: lua: Stop using the lua txn in hlua_http_get_headers() In this function, the lua txn was only used to test if the HTTP transaction is defined. But it is always used in a context where it is true. So, the lua txn can be removed from the function argument list. This patch is mandatory to allow the support of the filters written in lua.	2020-03-06 14:13:00 +01:00
Christopher Faulet	2ac9ba2a1c	MINOR: lua: Add function to know if a channel is a response one It is now possible to call Channel.is_resp(chn) method to know if a channel is a response channel or not.	2020-03-06 14:13:00 +01:00
Christopher Faulet	0ec740eaee	BUG/MINOR: lua: Ignore the reserve to know if a channel is full or not The Lua function Channel.is_full() should not take care of the reserve because it is not called from a producer (an applet for instance). From an action, it is allowed to overwrite the buffer reserve. This patch should be backported as far as 1.7. But it must be adapted for 1.8 and lower because there is no HTX on these versions.	2020-03-06 14:13:00 +01:00
Christopher Faulet	4ad7310399	BUG/MINOR: lua: Abort when txn:done() is called from a Lua action When a lua action aborts a transaction calling txn:done() function, the action must return ACT_RET_ABRT instead of ACT_RET_DONE. It is mandatory to abort the message analysis. This patch must be backported everywhere the commit `7716cdf45` ("MINOR: lua: Get the action return code on the stack when an action finishes") was backported. For now, no backport needed.	2020-03-06 14:12:59 +01:00
Christopher Faulet	e58c0002ff	BUG/MINOR: http-ana: Reset request analysers on a response side error When an error occurred on the response side, request analysers must be reset. At this stage, only AN_REQ_HTTP_XFER_BODY analyser remains, and possibly AN_REQ_FLT_END, if at least one filter is attached to the stream. So it is safe to remove the AN_REQ_HTTP_XFER_BODY analyser. An error was already handled and a response was already returned to the client (or it was at least scheduled to be sent). So there is no reason to continue to process the request payload. It may cause some troubles for the filters because when an error occurred, data from the request buffer are truncated. This patch must be backported as far as 1.9, for the HTX part only. I don't know if the legacy HTTP code is affected.	2020-03-06 14:12:59 +01:00
Christopher Faulet	e6a62bf796	BUG/MEDIUM: compression/filters: Fix loop on HTX blocks compressing the payload During the payload filtering, the offset is relative to the head of the HTX message and not its first index. This index is the position of the first block to (re)start the HTTP analysis. It must be used during HTTP analysis but not during the payload forwarding. So, from the compression filter point of view, when we loop on the HTX blocks to compress the response payload, we must start from the head of the HTX message. To ease the loop, we use the function htx_find_offset(). This patch must be backported as far as 2.0. It depends on the commit "MINOR: htx: Add a function to return a block at a specific offset". So this one must be backported first.	2020-03-06 14:12:59 +01:00
Christopher Faulet	497c759558	BUG/MEDIUM: cache/filters: Fix loop on HTX blocks caching the response payload During the payload filtering, the offset is relative to the head of the HTX message and not its first index. This index is the position of the first block to (re)start the HTTP analysis. It must be used during HTTP analysis but not during the payload forwarding. So, from the cache point of view, when we loop on the HTX blocks to cache the response payload, we must start from the head of the HTX message. To ease the loop, we use the function htx_find_offset(). This patch must be backported as far as 2.0. It depends on the commit "MINOR: htx: Add a function to return a block at a specific offset". So this one must be backported first.	2020-03-06 14:12:59 +01:00
Christopher Faulet	81340d7b53	BUG/MINOR: filters: Forward everything if no data filters are called If a filter enables the data filtering, in TCP or in HTTP, but it does not define the corresponding callback function (so http_payload() or tcp_payload()), it will be ignored. If all configured data filters do the same, we must be sure to forward everything. Otherwise nothing will be forwarded at all. This patch must be backported as far as 1.9.	2020-03-06 14:12:59 +01:00
Christopher Faulet	c50ee0b3b4	BUG/MINOR: filters: Use filter offset to deduce the amount of forwarded data When the tcp or http payload is filtered, it is important to use the filter offset to deduce the amount of forwarded data because this offset may change during the call to the callback function. So we should not rely on a local variable defined before this call. For now, existing HAProxy filters don't change this offset, so this bug may only affect external filters. This patch must be backported as far as 1.9.	2020-03-06 14:12:59 +01:00
Christopher Faulet	24598a499f	MINOR: flt_trace: Use htx_find_offset() to get the available payload length The trace_get_htx_datalen() function now uses htx_find_offset() to get the payload length, i.e. the length of consecutive DATA blocks.	2020-03-06 14:12:59 +01:00
Christopher Faulet	bb76aa4d37	MINOR: htx: Use htx_find_offset() to truncate an HTX message The htx_truncate() function now uses htx_find_offset() to find the first block to start the truncation.	2020-03-06 14:12:59 +01:00
Christopher Faulet	1cdceb9365	MINOR: htx: Add a function to return a block at a specific offset The htx_find_offset() function may be used to look for a block at a specific offset in an HTX message, starting from the message head. A compound result is returned, an htx_ret structure, with the found block and the position of the offset in the block. If the offset is outside of the HTX message, the returned block is NULL.	2020-03-06 14:12:59 +01:00
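The lookup described in the commit above can be illustrated on a simplified model where a message is a plain array of blocks. Everything here is hypothetical (`struct blk`, `struct find_ret`, `find_offset()`); the real HTX structures and the actual `htx_find_offset()` signature are not reproduced.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a message block: only its payload size. */
struct blk {
	uint32_t len;
};

/* Compound result: the block found and the offset inside that block. */
struct find_ret {
	const struct blk *blk;
	uint32_t off;
};

/* Walk the blocks from the head of the message until the one that
 * contains the requested offset. blk is NULL when the offset lies
 * outside of the message. */
static struct find_ret find_offset(const struct blk *blks, size_t nblks,
                                   uint32_t offset)
{
	struct find_ret ret = { NULL, 0 };
	size_t i;

	for (i = 0; i < nblks; i++) {
		if (offset < blks[i].len) {
			ret.blk = &blks[i];
			ret.off = offset;
			return ret;
		}
		offset -= blks[i].len;
	}
	return ret;
}
```

With blocks of 5, 3 and 4 bytes, offset 6 falls in the second block at position 1, while offset 12 is past the end and yields a NULL block.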
Tim Duesterhus	ba837ec367	CLEANUP: proxy_protocol: Use `size_t` when parsing TLVs Change `int` to `size_t` for consistency.	2020-03-06 11:16:19 +01:00
Tim Duesterhus	488ee7fb6e	BUG/MAJOR: proxy_protocol: Properly validate TLV lengths This patch fixes PROXYv2 parsing when the payload of the TCP connection is fused with the PROXYv2 header within a single recv() call. Previously HAProxy ignored the PROXYv2 header length when attempting to parse the TLV, possibly interpreting the first byte of the payload as a TLV type. This patch adds proper validation. It ensures that: 1. TLV parsing stops when the end of the PROXYv2 header is reached. 2. TLV lengths cannot exceed the PROXYv2 header length. 3. The PROXYv2 header ends together with the last TLV, not allowing for "stray bytes" to be ignored. A reg-test was added to ensure proper behavior. This patch tries to find the sweet spot between a small and easily backportable one, and a cleaner one that's more easily adaptable to older versions, hence why it merges the "if" and "while" blocks which causes a reindent of the whole block. It should be used as-is for versions 1.9 to 2.1, the block about PP2_TYPE_AUTHORITY should be dropped for 2.0 and the block about CRC32C should be dropped for 1.8. This bug was introduced when TLV parsing was added. This happened in commit `b3e54fe387`. This commit was first released with HAProxy 1.6-dev1. A similar issue was fixed in commit `7209c204bd`. This patch must be backported to HAProxy 1.6+.	2020-03-06 11:11:22 +01:00
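The three checks listed in the commit above can be sketched as a standalone validator. This is an illustration under stated assumptions, not the patch itself: `pp2_tlvs_valid()` and `TLV_HDR_LEN` are hypothetical names, and `len` is assumed to be the number of TLV bytes announced by the PROXYv2 header, with any connection payload lying beyond it.

```c
#include <stddef.h>
#include <stdint.h>

#define TLV_HDR_LEN 3 /* 1-byte type + 2-byte big-endian length */

/* Returns 1 if the TLV area [tlvs, tlvs + len) is well-formed. */
static int pp2_tlvs_valid(const uint8_t *tlvs, size_t len)
{
	size_t off = 0;

	/* 1. stop when the end of the announced header is reached */
	while (off < len) {
		size_t vlen;

		/* 3. leftover bytes too short for a TLV header are stray */
		if (len - off < TLV_HDR_LEN)
			return 0;

		vlen = ((size_t)tlvs[off + 1] << 8) | tlvs[off + 2];
		off += TLV_HDR_LEN;

		/* 2. a value may not extend past the announced header */
		if (vlen > len - off)
			return 0;
		off += vlen;
	}
	return 1;
}
```

Since `off` can never move past `len`, the loop only ends successfully when the last TLV finishes exactly at the end of the header.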
Willy Tarreau	b1beaa302c	BUG/MINOR: init: make the automatic maxconn consider the max of soft/hard limits James Stroehmann reported something working as documented but that can be considered as a regression in the way the automatic maxconn is calculated from the process' limits : https://www.mail-archive.com/haproxy@formilux.org/msg36523.html The purpose of the changes in 2.0 was to have maxconn default to the highest possible value permitted to the user based on the ulimit -n setting, however the calculation starts from the soft limit, which can be lower than what users were allowed to with previous versions where the default value of 2000 would force a higher ulimit -n as long as it fitted in the hard limit. Usually this is not noticeable if the user changes the limits, because quite commonly setting a new value restricts both the soft and hard values. Let's instead always use the max between the hard and soft limits, as we know these values are permitted. This was tried on the following setup: $ cat ulimit-n.cfg global stats socket /tmp/sock1 level admin $ ulimit -n 1024 Before the change the limits would show like this: $ socat - /tmp/sock1 <<< "show info" \| grep -im2 ^Max Maxsock: 1023 Maxconn: 489 After the change the limits are now much better and more in line with the default settings in earlier versions: $ socat - /tmp/sock1 <<< "show info" \| grep -im2 ^Max Maxsock: 4095 Maxconn: 2025 The difference becomes even more obvious when running moderately large configs with hundreds of checked servers and hundreds of listeners: $ cat ulimit-n.cfg global stats socket /tmp/sock1 level admin listen l bind :10000-10300 server-template srv- 300 0.0.0.0 check disabled Before After Maxsock 1024 4096 Maxconn 189 1725 This issue is tagged as minor since a trivial config change fixes it, but it would help new users to have it backported as far as 2.0.	2020-03-06 10:49:55 +01:00
Carl Henrik Lunde	f91ac19299	OPTIM: startup: fast unique_id allocation for acl. pattern_finalize_config() uses an inefficient algorithm which is a problem with very large configuration files. This affects startup, and therefore reload time. When haproxy is deployed as a router in a Kubernetes cluster the generated configuration file may be large and reloads are frequently occurring, which makes this a significant issue. The old algorithm is O(n^2) * allocate missing uids - O(n^2) * sort linked list - O(n^2) The new algorithm is O(n log n): * find the user allocated uids - O(n) * store them for efficient lookup - O(n log n) * allocate missing uids - n times O(log n) * sort all uids - O(n log n) * convert back to linked list - O(n) Performance examples, startup time in seconds: pat_refs old new 1000 0.02 0.01 10000 2.1 0.04 20000 12.3 0.07 30000 27.9 0.10 40000 52.5 0.14 50000 77.5 0.17 Please backport to 1.8, 2.0 and 2.1.	2020-03-06 08:11:58 +01:00
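The allocation step of the O(n log n) approach can be sketched as follows. This is a simplified stand-in, not the code from the commit: it works on a plain array instead of HAProxy's pattern reference list, it replaces the lookup structure with a sorted array walked once, and it omits the final sort and the conversion back to a linked list. The function name `assign_missing_ids` and the convention that a negative value means "no id yet" are assumptions made for the example.

```c
#include <stdlib.h>

/* Comparison callback for qsort() on ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Hypothetical helper: ids[i] >= 0 is a user-chosen id, ids[i] < 0 is
 * a missing id. Every missing entry receives the smallest id not
 * already in use. Returns 0 on success, -1 on allocation failure.
 */
int assign_missing_ids(int *ids, size_t n)
{
    int *used = malloc(n ? n * sizeof(*used) : 1);
    size_t nused = 0, u = 0;
    int next = 0;

    if (!used)
        return -1;

    /* Find the user allocated ids: O(n). */
    for (size_t i = 0; i < n; i++)
        if (ids[i] >= 0)
            used[nused++] = ids[i];

    /* Store them for efficient lookup: O(n log n). */
    qsort(used, nused, sizeof(*used), cmp_int);

    /* Allocate missing ids with a single pass over the sorted array. */
    for (size_t i = 0; i < n; i++) {
        if (ids[i] >= 0)
            continue;
        /* Skip candidate values that are already taken. */
        while (u < nused && used[u] <= next) {
            if (used[u] == next)
                next++;
            u++;
        }
        ids[i] = next++;
    }

    free(used);
    return 0;
}
```

The quadratic behaviour described for the old algorithm comes from rescanning the list for every missing id; here the sorted array is only ever walked forward.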
Tim Duesterhus	a17e66289c	MEDIUM: stream: Make the `unique_id` member of `struct stream` a `struct ist` The `unique_id` member of `struct stream` now is a `struct ist`.	2020-03-05 20:21:58 +01:00
Tim Duesterhus	0643b0e7e6	MINOR: proxy: Make `header_unique_id` a `struct ist` The `header_unique_id` member of `struct proxy` now is a `struct ist`.	2020-03-05 19:58:22 +01:00
Tim Duesterhus	ed5263739b	CLEANUP: Use `isttest()` and `istfree()` This adjusts a few locations to make use of `isttest()` and `istfree()`.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	e296d3e5f0	MINOR: ist: Add `int isttest(const struct ist)` `isttest` returns whether the `.ptr` is non-null.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	241e29ef9c	MINOR: ist: Add `IST_NULL` macro `IST_NULL` is equivalent to a `struct ist` with `.ptr = NULL` and `.len = 0`.	2020-03-05 19:52:07 +01:00
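The two additions described above (`IST_NULL` and `isttest()`) can be pictured with a stand-alone sketch written from the commit descriptions only. The struct layout with a `ptr` and a `len` member follows the text; the exact definitions in HAProxy's headers may differ.

```c
#include <stddef.h>

/* A string reference: pointer plus length, as described in the
 * commits. This is a local definition for the example.
 */
struct ist {
    char *ptr;
    size_t len;
};

/* A struct ist with .ptr = NULL and .len = 0. */
#define IST_NULL ((const struct ist){ .ptr = NULL, .len = 0 })

/* Returns non-zero when the .ptr member is non-null. */
static inline int isttest(const struct ist ist)
{
    return ist.ptr != NULL;
}
```

Note that the test is on the pointer, not on the length: an empty string with a valid pointer still tests true, which lets callers distinguish "unset" from "set but empty".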
Willy Tarreau	6cbe62b858	MINOR: debug: add CLI command "debug dev write" to write an arbitrary size This command is used to produce an arbitrary amount of data on the output. It can be used to test the CLI's state machine as well as the internal parts related to applets and I/O. A typical test consists in asking for all sizes from 0 to 16384: $ (echo "prompt;expert-mode on";for i in {0..16384}; do echo "debug dev write $i"; done) | socat - /tmp/sock1 | wc -c 134258738 A better test would consist in first waiting for the response before sending a new request. This command is not restricted to the admin since it's harmless.	2020-03-05 17:20:15 +01:00
Willy Tarreau	d04a2a6654	BUG/MINOR: ssl-sock: do not return an uninitialized pointer in ckch_inst_sni_ctx_to_sni_filters There's a build error reported here: `c9c6cdbf9c/checks` It's just caused by an unconditional assignment of tmp_filter to *sni_filter without having been initialized, though it's harmless because this return pointer is not used when fcount is NULL, which is the only case where this happens. No backport is needed as this was brought today by commit `38df1c8006` ("MINOR: ssl/cli: support crt-list filters").	2020-03-05 16:26:12 +01:00
William Lallemand	cfca1422c7	MINOR: ssl: reach a ckch_store from a sni_ctx It was only possible to go down from the ckch_store to the sni_ctx but not to go up from the sni_ctx to the ckch_store. To allow that, 2 pointers were added: - a ckch_inst pointer in the struct sni_ctx - a ckch_store pointer in the struct ckch_inst	2020-03-05 11:28:42 +01:00
William Lallemand	38df1c8006	MINOR: ssl/cli: support crt-list filters Generate a list of the previous filters when updating a certificate which uses filters in crt-list. Then pass this list to the function generating the sni_ctx during the commit. This feature allows the update of the crt-list certificates which use the filters with "set ssl cert". This function could be probably replaced by creating a new ckch_inst_new_load_store() function which takes the previous sni_ctx list as an argument instead of the char **sni_filter, avoiding the allocation/copy during runtime for each filter. But since we are still handling the multi-cert bundles, it's better this way to avoid code duplication.	2020-03-05 11:27:53 +01:00
Willy Tarreau	f4629a5346	BUG/MINOR: connection/debug: do not enforce !event_type on subscribe() anymore When building with DEBUG_STRICT, there are still some BUG_ON(events&event_type) in the subscribe() code which are not welcome anymore since we explicitly permit to wake the caller up on readiness. This causes some regtests to fail since `2c1f37d353` ("OPTIM: mux-h1: subscribe rather than waking up at a few other places") when built with this option. No backport is needed, this is 2.2-dev.	2020-03-05 07:46:33 +01:00
Tim Duesterhus	2825b4b0ca	MINOR: stream: Use stream_generate_unique_id This patch replaces the ad-hoc generation of stream's `unique_id` values by calls to `stream_generate_unique_id`.	2020-03-05 07:23:00 +01:00
Tim Duesterhus	127a74dd48	MINOR: stream: Add stream_generate_unique_id function Currently unique IDs for a stream are generated using repetitive code in multiple locations, possibly allowing for inconsistent behavior.	2020-03-05 07:23:00 +01:00
Willy Tarreau	2c1f37d353	OPTIM: mux-h1: subscribe rather than waking up at a few other places This is another round of conversion from a blind tasklet_wakeup() to a more careful subscribe(). It has significantly improved the number of function calls per HTTP request (/?s=1k/t=20) : before after tasklet_wakeup: 3 2 conn_subscribe: 3 2 h1_iocb: 3 2 h1_process: 3 2 h1_parse_msg_hdrs: 4 3 h1_rcv_buf: 5 3 h1_send: 5 4 h1_subscribe: 2 1 h1_wake_stream_for_send: 5 4 http_wait_for_request: 2 1 process_stream: 3 2 si_cs_io_cb: 4 2 si_cs_process: 3 1 si_cs_rcv: 5 3 si_sync_send: 2 1 si_update_both: 2 1 stream_int_chk_rcv_conn: 3 2 stream_int_notify: 3 1 stream_release_buffers: 9 4	2020-03-04 19:29:12 +01:00
Willy Tarreau	6f95f6e111	OPTIM: connection: disable receiving on disabled events when the run queue is too high In order to save a lot on syscalls, we currently don't disable receiving on a file descriptor anymore if its handler was already woken up. But if the run queue is huge and the poller collects a lot of events, this causes excess wakeups which take CPU time which is not used to flush these tasklets. This patch simply considers the run queue size to decide whether or not to stop receiving. Tests show that by stopping receiving when the run queue reaches ~16 times its configured size, we can still hold maximal performance in extreme situations like maxpollevents=20k for runqueue_depth=2, and still totally avoid calling epoll_event under moderate load using default settings on keep-alive connections.	2020-03-04 19:29:12 +01:00
Willy Tarreau	8de5c4fa15	MEDIUM: connection: only call ->wake() for connect() without I/O We used to call ->wake() for any I/O event for which there was no subscriber. But this is a problem because this causes massive wake() storms since we disabled fd_stop_recv() to save syscalls. The only reason for the io_available condition is to detect that an asynchronous connect() just finished and will not be handled by any registered event handler. Since we now properly handle synchronous connects, we can detect this situation by the fact that we had a success on conn_fd_check() and no requested I/O took over.	2020-03-04 19:29:12 +01:00
Willy Tarreau	4c69cff438	MINOR: tcp/uxst/sockpair: only ask for I/O when really waiting for a connect() Now that the stream-interface properly handles synchronous connects, there is no more reason for subscribing and doing nothing.	2020-03-04 19:29:12 +01:00
Willy Tarreau	ada4c5806b	MEDIUM: stream-int: make sure to try to immediately validate the connection In the rare case of immediate connect() (unix sockets, socket pairs, and occasionally TCP over the loopback), it is counter-productive to subscribe for sending and then getting immediately back to process_stream() after having passed through si_cs_process() just to update the connection. We already know it is established and it doesn't have any handshake anymore so we just have to complete it and return to process_stream() with the stream_interface in the SI_ST_RDY state. In this case, process_stream will simply loop back to the beginning to synchronize the state and turn it to SI_ST_EST/ASS/CLO/TAR etc. This will save us from having to needlessly subscribe in the connect() code, something which in addition cannot work with edge-triggered pollers.	2020-03-04 19:29:12 +01:00
Willy Tarreau	667fefdc90	BUG/MEDIUM: connection: stop polling for sending when the event is ready With commit `065a025610` ("MEDIUM: connection: don't stop receiving events in the FD handler") we disabled a number of fd_stop_* in conn_fd_handler(), in order to wait for their respective handlers to deal with them. But it is not correct to do that for the send direction, as we may very well have nothing to send. This is visible when connecting in TCP mode to a server with no data to send, there's nobody anymore to disable the polling for the send direction. And it is logical, on the recv() path we know the system has data to deliver and that some code will be in charge of it. On the send direction we simply don't know if it was the result of a successful connect() or if there is still something to send. In any case we almost never fill the network buffer on a single send() after being woken up by the system, so disabling the FD immediately or much later will not change the number of operations. No backport is needed, this is 2.2-dev.	2020-03-04 19:29:12 +01:00
Willy Tarreau	109201fc5c	BUILD: tools: rely on __ELF__ not USE_DL to enable use of dladdr() The approach was wrong. USE_DL is for the makefile to know if it's required to append -ldl at link time. Some platforms do not need it (and in fact do not have it) yet they have a working dladdr(). The real condition is related to ELF. Given that due to Lua, all platforms that require -ldl already have USE_DL set, let's replace USE_DL with __ELF__ here and consider that dladdr is always needed on ELF, which basically is already the case.	2020-03-04 12:04:07 +01:00
Willy Tarreau	9133e48f2a	BUILD: tools: unbreak resolve_sym_name() on non-GNU platforms resolve_sym_name() doesn't build when USE_DL is set on non-GNU platforms because "Elf(W)" isn't defined. Since it's only used for dladdr1(), let's refactor all this so that we can completely ifdef out that part on other platforms. Now we have a separate function to perform the call depending on the platform and it only returns the size when available.	2020-03-04 12:04:07 +01:00
Willy Tarreau	a91b7946bd	MINOR: debug: dump the whole trace if we can't spot the starting point Instead of special-casing the use of the symbol resolving to decide whether to dump a partial or complete trace, let's simply start over and dump everything when we reach the end after having found nothing. It will be more robust against dirty traces as well.	2020-03-04 12:04:07 +01:00
Willy Tarreau	13faf16e1e	MINOR: debug: improve backtrace() on aarch64 and possibly other systems It happens that on aarch64 backtrace() only returns one entry (tested with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding the stack due to the risk of hitting a bad pointer. Here we can use may_access() to know when it's safe, so we can actually unwind the stack without taking risks. It happens that the faulting function (the one just after the signal handler) is not listed here, very likely because the signal handler uses a special stack and did not create a new frame. So this patch creates a new my_backtrace() function in standard.h that either calls backtrace() or does its own unrolling. The choice depends on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build target.	2020-03-04 12:04:07 +01:00
Willy Tarreau	cdd8074433	MINOR: debug: report the number of entries in the backtrace It's useful to get an indication of unresolved stuff or memory corruption to have the apparent depth of the stack trace in the output, especially if we dump nothing.	2020-03-04 12:02:27 +01:00
Willy Tarreau	e58114e0e5	MINOR: wdt: do not depend on USE_THREAD There is no reason for restricting the use of the watchdog to threads anymore, as it works perfectly without threads as well.	2020-03-04 12:02:27 +01:00
Willy Tarreau	d6f1966543	MEDIUM: wdt: fall back to CLOCK_REALTIME if CLOCK_THREAD_CPUTIME is not available At least FreeBSD has a fully functional CLOCK_THREAD_CPUTIME but it cannot create a timer on it. This is not a problem since our timer is only used to measure each thread's usage using now_cpu_time_thread(). So by just replacing this clock with CLOCK_REALTIME we allow such platforms to periodically call the wdt and check the thread's CPU usage. The consequence is that even on a totally idle system there will still be a few extra periodic wakeups, but the watchdog becomes usable there as well.	2020-03-04 12:02:27 +01:00
Willy Tarreau	7259fa2b89	BUG/MINOR: wdt: do not return an error when the watchdog couldn't be enabled On operating systems not supporting to create a timer on POSIX_THREAD_CPUTIME we emit a warning but we return an error so the process fails to start, which is absurd. Let's return a success once the warning is emitted instead. This may be backported to 2.1 and 2.0.	2020-03-04 12:02:27 +01:00
Emmanuel Hocdet	842e94ee06	MINOR: ssl: add "ca-verify-file" directive It's only available for bind line. "ca-verify-file" allows to separate CA certificates from "ca-file". CA names sent in the server hello message are only computed from "ca-file". Typically, "ca-file" must be defined with intermediate certificates and "ca-verify-file" with certificates ending the chain, like root CA. Fix issue #404.	2020-03-04 11:53:11 +01:00
Willy Tarreau	0214b45a61	MINOR: debug: call backtrace() once upon startup Calling backtrace() will access libgcc at runtime. We don't want to do it after the chroot, so let's perform a first call to have it ready in memory for later use.	2020-03-04 06:01:40 +01:00
Willy Tarreau	f5b4e064dc	MEDIUM: debug: add support for dumping backtraces of stuck threads When a panic() occurs due to a stuck thread, we'll try to dump a backtrace of this thread if the config directive USE_BACKTRACE is set (which is the case on linux+glibc). For this we use the backtrace() call provided by glibc and iterate the pointers through resolve_sym_name(). In order to minimize the output (which is limited to one buffer), we only do this for stuck threads, and we start the dump above ha_panic()/ha_thread_dump_all_to_trash(), and stop when meeting known points such as main/run_tasks_from_list/run_poll_loop. If enabled without USE_DL, the dump will be complete with no details except that pointers will all be given relative to main, which is still better than nothing. The new USE_BACKTRACE config option is enabled by default on glibc since it has been present for ages. When it is set, the export-dynamic linker option is enabled so that all non-static symbols are properly resolved.	2020-03-03 18:40:03 +01:00
Willy Tarreau	cf12f2ee66	MINOR: cli: make "show fd" rely on resolve_sym_name() This way we can drop all hard-coded iocb matching.	2020-03-03 18:19:04 +01:00
Willy Tarreau	2e89b0930b	MINOR: debug: use resolve_sym_name() to dump task handlers Now in "show threads", the task/tasklet handler will be resolved using this function, which will provide more detailed results and will still support offsets to main for unresolved symbols.	2020-03-03 18:19:04 +01:00
Willy Tarreau	eb8b1ca3eb	MINOR: tools: add resolve_sym_name() to resolve function pointers We use various hacks at a few places to try to identify known function pointers in debugging outputs (show threads & show fd). Let's centralize this into a new function dedicated to this. It already knows about the functions matched by "show threads" and "show fd", and when built with USE_DL, it can rely on dladdr1() to resolve other functions. There are some limitations, as static functions are not resolved, linking with -rdynamic is mandatory, and even then some functions will not necessarily appear. It's possible to do a better job by rebuilding the whole symbol table from the ELF headers in memory but it's less portable and the gains are still limited, so this solution remains a reasonable tradeoff.	2020-03-03 18:18:40 +01:00
Willy Tarreau	762fb3ec8e	MINOR: tools: add new function dump_addr_and_bytes() This function dumps <n> bytes from <addr> in hex form into buffer <buf> enclosed in brackets after the address itself, formatted on 14 chars including the "0x" prefix. This is meant to be used as a prefix for code areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: " It relies on may_access() to know if the bytes are dumpable, otherwise "--" is emitted. An optional prefix is supported.	2020-03-03 17:46:37 +01:00
Willy Tarreau	55a6c4f34d	BUILD: tools: remove obsolete and conflicting trace() from standard.c Since commit `4c2ae48375` ("MINOR: trace: implement a very basic trace() function") merged in 2.1, trace() is an inline function. It must not appear in standard.c anymore and may break build depending on includes. This can be backported to 2.1.	2020-03-03 17:46:37 +01:00
Willy Tarreau	27d00c0167	MINOR: task: export run_tasks_from_list This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	3ebd55ee51	MINOR: haproxy: export run_poll_loop This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	1827845a3d	MINOR: haproxy: export main to ease access from debugger Better just export main instead of declaring it as extern, it's cleaner and may be usable elsewhere.	2020-03-03 15:26:10 +01:00
Willy Tarreau	82aafc4a0f	BUG/MEDIUM: debug: make the debug_handler check for the thread in threads_to_dump It happens that just sending the debug signal to the process makes one thread wait for its turn while nobody wants to dump. We need to at least verify that a dump was really requested for this thread. This can be backported to 2.1 and 2.0.	2020-03-03 08:31:34 +01:00
Willy Tarreau	516853f1cc	MINOR: debug: report the task handler's pointer relative to main Often in crash dumps we see unknown function pointers. Let's display them relative to main, that helps quite a lot figure the function from an executable, for example: (gdb) x/a main+645360 0x4c56a0 <h1_timeout_task>: 0x2e6666666666feeb This could be backported to 2.0.	2020-03-03 07:04:42 +01:00
Willy Tarreau	7d9421deca	MINOR: tools: make sure to correctly check the returned 'ms' in date2std_log In commit `4eee38a` ("BUILD/MINOR: tools: fix build warning in the date conversion functions") we added some return checks to shut build warnings but the last test is useless since the tested pointer is not updated by the last call to utoa_pad() used to convert the milliseconds. It turns out the original code from 2012 already skipped this part, probably in order to avoid the risk of seeing a millisecond field not belonging to the 0-999 range. Better keep the check and put the code into stricter shape. No backport is needed. This fixes issue #526.	2020-02-29 09:08:02 +01:00
Willy Tarreau	77e463f729	BUG/MINOR: arg: don't reject missing optional args Commit `80b53ffb1c` ("MEDIUM: arg: make make_arg_list() stop after its own arguments") changed the way we detect the empty list because we cannot stop by looking up the closing parenthesis anymore, thus for the first missing arg we have to enter the parsing loop again. And there, finding an empty arg means we go to the empty_err label, where it was not initially planned to handle this condition. This results in %[date()] to fail while %[date] works. Let's simply check if we've reached the minimally supported args there (it used to be done during the function entry). Thanks to Jérôme for reporting this issue. No backport is needed, this is 2.2-dev2+ only.	2020-02-28 16:41:29 +01:00
Willy Tarreau	493d9dc6ba	MEDIUM: mux-h1: do not blindly wake up the tasklet at end of request anymore Since commit "MEDIUM: connection: make the subscribe() call able to wakeup if ready" we have the guarantee that the tasklet will be woken up if subscribing to a connection for an event that's ready. Since we have too many tasklet_wakeup() calls in mux-h1, let's now use this property to improve the situation a bit. With this change, no syscall count changed, however the number of useless calls to some functions significantly went down. Here are the differences for the test below (100k req), in number of calls per request : $ ./h1load -n 100000 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 before after change note tasklet_wakeup: 3 1 -66% h1_io_cb: 4 3 -25% h1_send: 6.7 5.4 -19% h1_wake: 0.73 0.44 -39% h1_process: 4.7 3.4 -27% h1_wake_stream_for_send: 6.7 5.5 -18% si_cs_process 3.7 3.4 -7.8% conn_fd_handler 2.7 2.4 -10% raw_sock_to_buf: 4 2 -50% pool_free: 4 2 -50% from failed rx calls Note that the situation could be further improved by having muxes lazily subscribe to Rx events in case the FD is already being polled. However this requires deeper changes to implement a LAZY_RECV subscribe mode, and to replace the FD's active bit by 3 states representing the desired action to perform on the FD during the update, among NONE (no need to change), POLL (can't proceed without), and STOP (buffer full). This would only impact Rx since on Tx we know what we have to send. The savings to expect from this might be more visible with splicing and/or when dealing with many connections having long think times.	2020-02-28 16:17:09 +01:00
Willy Tarreau	065a025610	MEDIUM: connection: don't stop receiving events in the FD handler The remaining epoll_ctl() calls are exclusively caused by the disagreement between conn_fd_handler() and the mux receiving the data: the fd handler wants to stop after having woken up the tasklet, then the mux after receiving data wants to receive again. Given that they don't happen in the same poll loop when there are many FDs, this causes a lot of state changes. As suggested by Olivier, if the task is already scheduled for running, we don't need to disable the event because it's in the run queue, poll() cannot stop, and reporting it again will be harmless. What might happen however is that a sampling-based poller like epoll() would report many times the same event and has trouble getting others behind. But if it would happen, it would still indicate the run queue has plenty of pending operations, so it would in fact only displace the problem from the poller to the run queue, which doesn't seem to be worse (and in fact we do support priorities while the poller does not). By doing this change, the keep-alive test with 1k conns and 100k reqs completely gets rid of the per-request epoll_ctl changes, while still not causing extra recvfrom() : $ ./h1load -n 100000 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 200000 sendto 1 200000 recvfrom 1 10762 epoll_wait 1 3664 epoll_ctl 1 1999 recvfrom -1 In close mode, it didn't change anything, we're still in the optimal case (2 epoll per connection) : $ ./h1load -n 100000 -r 1 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 203764 epoll_ctl 1 200000 sendto 1 200000 recvfrom 1 6091 epoll_wait 1 2994 recvfrom -1	2020-02-28 16:17:09 +01:00
Willy Tarreau	7e59c0a5e1	MEDIUM: connection: make the subscribe() call able to wakeup if ready There's currently an internal API limitation at the connection layer regarding conn_subscribe(). We must not subscribe if we haven't yet met EAGAIN or such a condition, so we sometimes force ourselves to read in order to meet this condition and being allowed to call subscribe. But reading cannot always be done (e.g. at the end of a loop where we cannot afford to retrieve new data and start again) so we instead perform a tasklet_wakeup() of the requester's io_cb. This is what is done in mux_h1 for example. The problem with this is that it forces a new receive when we're not necessarily certain we need one. And if the FD is not ready and was already being polled, it's a useless wakeup. The current patch improves the connection-level subscribe() so that it really manipulates the polling if the FD is marked not-ready, but instead schedules the argument tasklet for a wakeup if the FD is ready. This guarantees we'll wake this tasklet up in any case once the FD is ready, either immediately or after polling. By doing so, a test on pure close mode shows we cut in half the number of epoll_ctl() calls and almost eliminate failed recvfrom(): $ ./h1load -n 100000 -r 1 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 before: 399464 epoll_ctl 1 200007 recvfrom 1 200000 sendto 1 100000 recvfrom -1 7508 epoll_wait 1 after: 205739 epoll_ctl 1 200000 sendto 1 200000 recvfrom 1 6084 epoll_wait 1 2651 recvfrom -1 On keep-alive there is no change however.	2020-02-28 16:17:09 +01:00
Willy Tarreau	8dd348c90c	MINOR: rawsock: always mark the FD not ready when we're certain it happens This partially reverts commit `1113116b4a` ("MEDIUM: raw-sock: remove obsolete calls to fd_{cant,cond,done}_{send,recv}") so that we can mark the FD not ready as required since commit `19bc201c9f` ("MEDIUM: connection: remove the intermediary polling state from the connection"). Indeed, with the removal of the latter we don't have any other reliable indication that the FD is blocked, which explains why there are so many EAGAIN in traces. It's worth noting that a short read or a short write are also reliable indicators of exhausted buffers and are even documented as such in the epoll man page in case of edge-triggered mode. That's why we also report the FD as blocked in such a case. With this change we completely got rid of EAGAIN in keep-alive tests, but they were expectedly transferred to epoll_ctl: $ ./h1load -n 100000 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 before: 266331 epoll_ctl 1 200000 sendto 1 200000 recvfrom 1 135757 recvfrom -1 8626 epoll_wait 1 after: 394865 epoll_ctl 1 200000 sendto 1 200000 recvfrom 1 10748 epoll_wait 1 1999 recvfrom -1	2020-02-28 16:17:09 +01:00
Christopher Faulet	b045bb221a	MINOR: mux-h1: Remove useless case-insensitive comparisons Header names from an HTX message are always in lower-case, so the comparison may be case-sensitive.	2020-02-28 10:49:09 +01:00
Christopher Faulet	3e1f7f4a39	BUG/MINOR: http-htx: Do case-insensitive comparisons on Host header name When a header is added or modified, in http_add_header() or http_replace_header(), a comparison is performed on its name to know if it is the Host header and if the authority part of the uri must be updated or not. This comparison must be case-insensitive. This patch should fix the issue #522. It must be backported to 2.1.	2020-02-28 10:49:09 +01:00
Lukas Tribus	81725b867c	BUG/MINOR: dns: ignore trailing dot As per issue #435 a hostname with a trailing dot confuses our DNS code, as for a zero length DNS label we emit a null-byte. This change makes us ignore the zero length label instead. Must be backported to 1.8.	2020-02-28 10:26:29 +01:00
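The trailing-dot behaviour described above can be illustrated with a small encoder that turns a hostname into DNS wire-format labels. It is a sketch under assumptions, not HAProxy's resolver code: the function name `encode_dns_name` and its error convention are hypothetical. The point it shows is that the zero-length label left after a trailing dot is skipped instead of being emitted as an extra null byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encodes <host> as length-prefixed DNS labels
 * followed by the terminating root label. Returns the number of bytes
 * written to <out>, or -1 on error (empty label inside the name,
 * label longer than 63 bytes, or output buffer too small).
 */
int encode_dns_name(const char *host, unsigned char *out, size_t outlen)
{
    const char *p = host;
    size_t o = 0;

    for (;;) {
        size_t len = strcspn(p, ".");

        if (len == 0) {
            /* End of the name, possibly right after a trailing dot:
             * ignore the zero-length label instead of emitting it.
             */
            if (*p == '\0')
                break;
            /* Empty label at the start or in the middle. */
            return -1;
        }

        /* Keep room for the length byte, the label and the final 0. */
        if (len > 63 || o + 1 + len + 1 > outlen)
            return -1;

        out[o++] = (unsigned char)len;
        memcpy(out + o, p, len);
        o += len;

        p += len;
        if (*p == '.')
            p++;
    }

    if (o >= outlen)
        return -1;
    out[o++] = 0; /* root label terminates the name */
    return (int)o;
}
```

With this sketch, "example.com" and "example.com." produce the same bytes, while a name such as "a..b" is rejected.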
William Lallemand	858885737c	BUG/MEDIUM: ssl: chain must be initialized with sk_X509_new_null() Even when there isn't a chain, it must be initialized to an empty X509 structure with sk_X509_new_null(). This patch fixes a segfault which appears with older versions of the SSL libs (openssl 0.9.8, libressl 2.8.3...) because X509_chain_up_ref() does not check the pointer. This bug was introduced by `b90d2cb` ("MINOR: ssl: resolve issuers chain later"). Should fix issue #516.	2020-02-27 14:48:35 +01:00
Tim Duesterhus	530408f976	BUG/MINOR: sample: Make sure to return stable IDs in the unique-id fetch Previously when the `unique-id-format` contained non-deterministic parts, such as the `uuid` fetch each use of the `unique-id` fetch would generate a new unique ID, replacing the old one. The following configuration shows the error: global log stdout format short daemon listen test log global log-format "%ID" unique-id-format %{+X}o\ TEST-%[uuid] mode http bind *:8080 http-response set-header A %[unique-id] http-response set-header B %[unique-id] server example example.com:80 Without the patch the contents of the `A` and `B` response header would differ. This bug was introduced in commit `f4011ddcf5`, which was first released with HAProxy 1.7-dev3. This fix should be backported to HAProxy 1.7+.	2020-02-27 03:50:10 +01:00
Willy Tarreau	55c5399846	MINOR: epoll: always initialize all of epoll_event to please valgrind valgrind complains that epoll_ctl() uses an epoll_event in which we have only set the part we use from the data field (i.e. the fd). Tests show that pre-initializing the struct in the stack doesn't have a measurable impact so let's do it.	2020-02-26 14:36:27 +01:00
Willy Tarreau	c1563e5474	MINOR: wdt: always clear sigev_value to make valgrind happy In issue #471 it was reported that valgrind sometimes complains about timer_create() being called with uninitialized bytes. These are in fact the bits from sigev_value.sival_ptr that are not part of sival_int that are tagged as such, as valgrind has no way to know we're using the int instead of the ptr in the union. It's cheap to initialize the field so let's do it.	2020-02-26 14:05:20 +01:00
Willy Tarreau	fd2658c0c6	BUG/MINOR: h2: reject again empty :path pseudo-headers Since commit `92919f7fd5` ("MEDIUM: h2: make the request parser rebuild a complete URI") we make sure to rebuild a complete URI. Unfortunately the test for an empty :path pseudo-header that is mandated by #8.1.2.3 happened to be performed on the URI before this patch, which is never empty anymore after being rebuilt, causing h2spec to complain : 8. HTTP Message Exchanges 8.1. HTTP Request/Response Exchange 8.1.2. HTTP Header Fields 8.1.2.3. Request Pseudo-Header Fields - 1: Sends a HEADERS frame with empty ":path" pseudo-header field -> The endpoint MUST respond with a stream error of type PROTOCOL_ERROR. Expected: GOAWAY Frame (Error Code: PROTOCOL_ERROR) RST_STREAM Frame (Error Code: PROTOCOL_ERROR) Connection closed Actual: DATA Frame (length:0, flags:0x01, stream_id:1) It's worth noting that this error doesn't trigger when calling h2spec with a timeout as some scripts do, which explains why it wasn't detected after the patch above. This fixes one half of issue #471 and should be backported to 2.1.	2020-02-26 13:56:24 +01:00
Emmanuel Hocdet	cf8cf6c5cd	MINOR: ssl/cli: "show ssl cert" command should print the "Chain Filename:" When the issuers chain of a certificate is picked from the "issuers-chain-path" tree, "show ssl cert" prints it.	2020-02-26 13:11:59 +01:00
Emmanuel Hocdet	6f507c7c5d	MINOR: ssl: resolve ocsp_issuer later The goal is to use the ckch to store data from PEM files or <payload> and only for that. This patch addresses the ckch->ocsp_issuer case. It finds issuers chain if no chain is present in the ckch in ssl_sock_put_ckch_into_ctx(), filling the ocsp_issuer from the chain must be done after. It changes the way '.issuer' is managed: it tries to load '.issuer' in ckch->ocsp_issuer first and then look for the issuer in the chain later (in ssl_sock_load_ocsp() ). "ssl-load-extra-files" without the "issuer" parameter can negate extra '.issuer' file check.	2020-02-26 13:11:59 +01:00
Emmanuel Hocdet	b90d2cbc42	MINOR: ssl: resolve issuers chain later The goal is to use the ckch to store data from a loaded PEM file or a <payload> and only for that. This patch addresses the ckch->chain case. Looking for the issuers chain, if no chain is present in the ckch, can be done in ssl_sock_put_ckch_into_ctx(). This way it is possible to know the origin of the certificate chain without an extra state.	2020-02-26 13:06:04 +01:00
Emmanuel Hocdet	75a7aa13da	MINOR: ssl: move find certificate chain code to its own function New function ssl_get_issuer_chain(cert) to find an issuer_chain entry from "issuers-chain-path" tree.	2020-02-26 12:48:47 +01:00
Willy Tarreau	2104659cd5	MEDIUM: buffer: remove the buffer_wq lock This lock was only needed to protect the buffer_wq list, but now we have the mt_list for this. This patch simply turns the buffer_wq list to an mt_list and gets rid of the lock. It's worth noting that the whole buffer_wait thing still looks totally wrong especially in a threaded context: the wakeup_cb() callback is called synchronously from any thread and may end up calling some connection code that was not expected to run on a given thread. The whole thing should probably be reworked to use tasklets instead and be a bit more centralized.	2020-02-26 10:39:36 +01:00
William Lallemand	e0f3fd5b4c	CLEANUP: ssl: move issuer_chain tree and definition Move the cert_issuer_tree outside the global_ssl structure since it's not a configuration variable. And move the declaration of the issuer_chain structure in types/ssl_sock.h	2020-02-25 15:06:40 +01:00
William Lallemand	a90e593a7a	MINOR: ssl/cli: reorder 'show ssl cert' output Reorder the 'show ssl cert' output so it's easier to see if the whole chain is correct. For a chain to be correct, an "Issuer" line must have the same content as the next "Subject" line. Example: Subject: /C=FR/ST=Paris/O=HAProxy Test Certificate/CN=test.haproxy.local Issuer: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Root CA/CN=root.haproxy.local	2020-02-25 14:17:50 +01:00
William Lallemand	bb7288a9f5	MINOR: ssl/cli: 'show ssl cert' displays the issuer in the chain For each certificate in the chain, displays the issuer, so it's easy to know if the chain is right. Also rename "Chain" to "Chain Subject". Example: Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Root CA/CN=root.haproxy.local	2020-02-25 14:17:44 +01:00
William Lallemand	35f4a9dd8c	MINOR: ssl/cli: 'show ssl cert' displays the chain Display the subject of each certificate contained in the chain in the output of "show ssl cert <filename>". Each subject is on its own line, prefixed by "Chain: " Example: Chain: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local Chain: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local	2020-02-25 12:02:51 +01:00
Willy Tarreau	1b85785bc2	MINOR: config: mark global.debug as deprecated This directive has never made any sense and has already caused trouble by forcing the process to stay in foreground during the boot process. Let's emit a warning mentioning it's deprecated and will be removed in 2.3.	2020-02-25 11:28:58 +01:00
Willy Tarreau	7f26391bc5	BUG/MINOR: connection: make sure to correctly tag local PROXY connections As reported in issue #511, when sending an outgoing local connection (e.g. health check) we must set the "local" tag and not a "proxy" tag. The issue comes from historic support on v1 which required to steal the address on the outgoing connection for such ones, creating confusion in the v2 code which believes it sees the incoming connection. In order not to risk to break existing setups which might rely on seeing the LB's address in the connection's source field, let's just change the connection type from proxy to local and keep the addresses. The protocol spec states that for local, the addresses must be ignored anyway. This problem has always existed, this can be backported as far as 1.5, though it's probably not a good idea to change such setups, thus maybe 2.0 would be more reasonable.	2020-02-25 10:31:37 +01:00
Willy Tarreau	1ac83af560	CLEANUP: connection: use read_u32() instead of a cast in the netscaler parser The netscaler protocol parser used to involve a few casts from char to (uint32_t*), let's properly use u32 for this instead.	2020-02-25 10:24:51 +01:00
Willy Tarreau	26474c486d	CLEANUP: lua: fix aliasing issues in the address matching code Just use read_u32() instead of casting IPv6 addresses to uint32_t*.	2020-02-25 10:24:51 +01:00
Willy Tarreau	296cfd17ef	MINOR: pattern: fix all remaining strict aliasing issues There were still a number of struct casts from various sizes. All of them were now replaced with read_u32(), read_u16(), read_u64() or memcpy().	2020-02-25 10:24:51 +01:00
Willy Tarreau	a8b7ecd4dc	CLEANUP: sample: use read_u64() in ipmask() to apply an IPv6 mask There were 8 strict aliasing warnings there due to the dereferences casting to uint32_t of input and output. We can achieve the same using two write_u64() and four read_u64() which do not cause this issue and even let the compiler use 64-bit operations.	2020-02-25 10:24:14 +01:00
Willy Tarreau	6cde5d883c	CLEANUP: stick-tables: use read_u32() to display a node's key This fixes another aliasing issue that pops up in stick_table.c and peers.c's debug code.	2020-02-25 09:41:22 +01:00
Willy Tarreau	8b5075806d	CLEANUP: cache: use read_u32/write_u32 to access the cache entry's hash Enabling strict aliasing fails on the cache's hash which is a series of 20 bytes cast as u32. And in practice it could even fail on some archs if the http_txn didn't guarantee the hash was properly aligned. Let's use read_u32() to read the value and write_u32() to set it, this makes sure the compiler emits the correct code to access these and knows about the intentional aliasing.	2020-02-25 09:35:07 +01:00
Willy Tarreau	2b9f0664d6	CLEANUP: fd: use a union in fd_rm_from_fd_list() to shut aliasing warnings Enabling strict aliasing fails in fd.c when using the double-word CAS, let's get rid of the (void*)(void)&cur_list junk and use a union instead. This way the compiler knows they do alias.	2020-02-25 09:25:53 +01:00
Willy Tarreau	105599c1ba	BUG/MEDIUM: ssl: fix several bad pointer aliases in a few sample fetch functions Sample fetch functions ssl_x_sha1(), ssl_fc_npn(), ssl_fc_alpn(), ssl_fc_session_id(), as well as the CLI's "show cert details" handler used to dereference the output buffer's <data> field by casting it to "unsigned *". But while doing this could work before 1.9, it broke starting with commit `843b7cbe9d` ("MEDIUM: chunks: make the chunk struct's fields match the buffer struct") which merged chunks and buffers, causing the <data> field to become a size_t. The impact is only on 64-bit platforms and depends on the endianness: on little endian, there should never be any non-zero bits in the field as it is supposed to have been zeroed before the call, so it should be harmless; on big endian, only the high part of the value is written instead of the lower one, often making the result appear 4 billion times larger, and making such values dropped everywhere due to being larger than a buffer. It seems that it would be wise to try to re-enable strict-aliasing to catch such errors. This must be backported till 1.9.	2020-02-25 08:59:23 +01:00
Willy Tarreau	5715da269d	BUG/MINOR: sample: fix the json converter's endian-sensitivity About every time there's a pointer cast in the code, there's a hidden bug, and this one was no exception, as it passes the first octet of the native representation of an integer as a single-character string, which obviously only works on little endian machines. On big-endian machines, something as simple as "str(foo),json" only returns zeroes. This bug was introduced with the JSON converter in 1.6-dev1 by commit `317e1c4f1e` ("MINOR: sample: add "json" converter"), the fix may be backported to all stable branches.	2020-02-25 08:47:45 +01:00
Willy Tarreau	908071171b	BUILD: general: always pass unsigned chars to is* functions The isalnum(), isalpha(), isdigit() etc functions from ctype.h are supposed to take an int argument which must either reflect an unsigned char or EOF. In practice on some platforms they're implemented as macros referencing an array, and when passed a char, they either cause a warning "array subscript has type 'char'" when lucky, or cause random segfaults when unlucky. It's quite inconvenient by the way since none of them may return true for negative values. The recent introduction of cygwin to the list of regularly tested build platforms revealed a lot of breakage there due to the same issues again. So this patch addresses the problem all over the code at once. It adds unsigned char casts to every valid use case, and also drops the unneeded double cast to int that was sometimes added on top of it. It may be backported by dropping irrelevant changes if that helps better support uncommon platforms. It's unlikely to fix bugs on platforms which would already not emit any warning though.	2020-02-25 08:16:33 +01:00
Willy Tarreau	ded15b7564	BUILD: ssl: only pass unsigned chars to isspace() A build failure on cygwin was reported on github actions here: https://github.com/haproxy/haproxy/runs/466507874 It's caused by a signed char being passed to isspace(), and this one being implemented as a macro instead of a function as the man page suggests. It's the same issue that regularly pops up on Solaris. This comes from commit `98263291cc` which was merged in 1.8-dev1. A backport is possible though not incredibly useful.	2020-02-25 07:51:59 +01:00
Tim Duesterhus	017484c80f	CLEANUP: cfgparse: Fix type of second calloc() parameter `curr_idle_thr` is of type `unsigned int`, not `int`. Fix this issue by taking the size of the dereferenced `curr_idle_thr` array. This issue was introduced when adding the `curr_idle_thr` struct member in commit `f131481a0a`. This commit is first tagged in 2.0-dev1 and marked for backport to 1.9.	2020-02-25 07:42:51 +01:00
Willy Tarreau	03e7853581	BUILD: remove obsolete support for -mregparm / USE_REGPARM This used to be a minor optimization on ix86 where registers are scarce and the calling convention not very efficient, but this platform is not relevant enough anymore to warrant all this dirt in the code for the sake of saving 1 or 2% of performance. Modern platforms don't use this at all since their calling convention already defaults to using several registers so better get rid of this once for all.	2020-02-25 07:41:47 +01:00
William Lallemand	3f25ae31bd	BUG/MINOR: ssl: load .key in a directory only after PEM Don't try to load a .key in a directory without loading its associated certificate file. This patch ignores the .key files when iterating over the files in a directory. Introduced by `4c5adbf` ("MINOR: ssl: load the key from a dedicated file").	2020-02-24 16:34:16 +01:00
William Lallemand	4c5adbf595	MINOR: ssl: load the key from a dedicated file For a certificate on a bind line, if the private key was not found in the PEM file, look for a .key and load it. This default behavior can be changed by using the ssl-load-extra-files directive in the global section. This feature was mentioned in issue #221.	2020-02-24 15:39:53 +01:00
Willy Tarreau	02ac950a11	CLEANUP: http/h1: rely on HA_UNALIGNED_LE instead of checking for CPU families Now that we have flags indicating the CPU's capabilities, better use them instead of missing some updates for new CPU families (ARMv8 was missing there).	2020-02-21 16:32:57 +01:00
Willy Tarreau	a7ddab0c25	BUG/MEDIUM: shctx: make sure to keep all blocks aligned The blocksize and the extra field are not necessarily aligned on a machine word. This can result in crashing an align-sensitive machine when initializing the shctx area. Let's round both sizes up to a pointer size to make this safe everywhere. This fixes issue #512. This should be backported as far as 1.8.	2020-02-21 13:45:58 +01:00
Jerome Magnin	4fb196c1d6	CLEANUP: sample: use iststop instead of a for loop In sample_fetch_path we can use iststop() instead of a for loop to find the '?' and return the correct length. This requires commit "MINOR: ist: add an iststop() function".	2020-02-21 11:53:18 +01:00
Jerome Magnin	4bbc9494b7	BUG/MINOR: http: http-request replace-path duplicates the query string In http_action_replace_uri() we call http_get_path() in the case of a replace-path rule. http_get_path() will return an ist pointing to the start of the path, but uri.ptr + uri.len points to the end of the uri. As a result, we are matching against a string containing the query, which we append to the "path" later, effectively duplicating the query string. This patch uses the iststop() function introduced in "MINOR: ist: add an iststop() function" to find the '?' character and update the ist length when needed. This fixes issue #510. The bug was introduced by commit `262c3f1a` ("MINOR: http: add a new "replace-path" action"), which was backported to 2.1 and 2.0.	2020-02-21 11:52:14 +01:00
Willy Tarreau	6e59cb5db1	MINOR: mux-h1: pass CO_RFL_READ_ONCE to the lower layers when relevant When we're in H1_MSG_RQBEFORE or H1_MSG_RPBEFORE, we know that the first message is highly likely the only one and that it's pointless to try to perform a second recvfrom() to complete a first partial read. This is similar to what used to be done in the older I/O methods with the CF_READ_DONTWAIT flag on the channel. So let's pass CO_RFL_READ_ONCE to the transport layer during rcv_buf() in this case. By doing so, in a test involving keep-alive connections with a non-null client think time, we remove 20% of the recvfrom() calls, all of which used to systematically fail. More precisely, we observe a drop from 5.0 recvfrom() per request with 60% failure to 4.0 per request with 50% failure.	2020-02-21 11:38:50 +01:00
Willy Tarreau	716bec2dc6	MINOR: connection: introduce a new receive flag: CO_RFL_READ_ONCE This flag is currently supported by raw_sock to perform a single recv() attempt and avoid subscribing. Typically on the request and response paths with keep-alive, with short messages we know that it's very likely that the first message is enough.	2020-02-21 11:22:45 +01:00
Willy Tarreau	5d4d1806db	CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv} This marks the end of the transition from the connection polling states introduced in 1.5-dev12 to the subscriptions that arrived in 1.9. The socket layer can now safely use its FD while all upper layers rely exclusively on subscriptions. These old functions were removed. Some may deserve some renaming to improve clarity though. The single call to conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling() which already does the same.	2020-02-21 11:21:12 +01:00
Willy Tarreau	d1d14c3157	MINOR: connection: remove the last calls to conn_xprt_{want,stop}_* The last few calls to conn_xprt_{want,stop}_{recv,send} in the central connection code were replaced with their strictly exact equivalent fd_*, adding the call to conn_ctrl_ready() when it was missing.	2020-02-21 11:21:12 +01:00
Willy Tarreau	562e0d8619	MINOR: tcp/uxst/sockpair: use fd_want_send() instead of conn_xprt_want_send() Just like previous commit, we don't need to pass through the connection layer anymore to enable polling during a connect(), we know the FD, so let's simply call fd_want_send().	2020-02-21 11:21:12 +01:00
Willy Tarreau	3110eb769b	MINOR: raw_sock: directly call fd_stop_send() and not conn_xprt_stop_send() Now that we know that the connection layer is transparent for polling changes, we have no reason for hiding behind conn_xprt_stop_send() and can safely call fd_stop_send() on the FD once the buffer is empty.	2020-02-21 11:21:12 +01:00
Willy Tarreau	19bc201c9f	MEDIUM: connection: remove the intermediary polling state from the connection Historically we used to require that the connections held the desired polling states for the data layer and the socket layer. Then with muxes these were more or less merged into the transport layer, and now it happens that with all transport layers having their own state, the "transport layer state" as we have it in the connection (XPRT_RD_ENA, XPRT_WR_ENA) is only an exact copy of the underlying file descriptor state, but with a delay. All of this is causing some difficulties at many places in the code because there are still some locations which use the conn_want_* API to remain clean and only rely on connection, and count on a later collection call to conn_cond_update_polling(), while others need an immediate action and directly use the FD updates. Since our updates are now much cheaper, most of them being only an atomic test-and-set operation, and since our I/O callbacks are deferred, there's no benefit anymore in trying to "cache" the transient state change in the connection flags hoping to cancel them before they become an FD event. Better make such calls transparent indirections to the FD layer instead and get rid of the deferred operations which needlessly complicate the logic inside. This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE. A number of functions related to polling updates were either greatly simplified or removed. Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data were expected to be sent after a PROXY protocol or SOCKSv4 header. These ones were simply replaced with a check on the subscription which is where we ought to get the authoritative information from. Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts are the same. conn_stop_polling() and conn_xprt_stop_both() are the same as well. conn_cond_update_polling() only causes errors to stop polling. It also becomes way more obvious that muxes should not at all employ conn_xprt_{want|stop}_{recv,send}(), and that the call to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer is inappropriate, it ought to unsubscribe from reads instead. All of this definitely requires a serious cleanup.	2020-02-21 11:21:12 +01:00
Willy Tarreau	902871dd07	CLEANUP: epoll: place the struct epoll_event in the stack Historically we used to have a global epoll_event for various manipulations involving epoll_ctl() and when threads were added, this was turned to a thread_local, which is needlessly expensive since it's just a temporary variable. Let's move it to a local variable wherever it's called instead.	2020-02-21 11:21:12 +01:00
Willy Tarreau	7c9d0e1b20	MINOR: checks: do not call conn_xprt_stop_send() anymore While trying to address issue #253, Commit `5909380c` ("BUG/MINOR: checks: stop polling for write when we have nothing left to send") made sure that we stop polling for writes when the buffer is empty. This was actually more a workaround than a bug fix because by doing so we may be stopping polling for an intermediary transport layer without acting on the check itself in case there's SSL or send-proxy in the chain for example, thus the approach is wrong. In practice due to the small size of check requests, this will not have any impact. At best, we ought to unsubscribe for sending, but that's already the case when we arrive in this function. But given that the root cause of the issue was addressed later in commits `cc705a6b`, `c5940392` and `ccf3f6d1`, we can now safely revert this change. It was confirmed on the faulty config that this change doesn't have any effect anymore on the test.	2020-02-21 11:21:12 +01:00
Willy Tarreau	d57e34978d	BUG/MINOR: mux: do not call conn_xprt_stop_recv() on buffer shortage In H1/H2/FCGI, the *_get_buf() functions try to disable receipt of data when there's no buffer available. But they do so at the lowest possible level, which is unrelated to the upper transport layers which may still be trying to feed data based on subscription. The correct approach here would theoretically be to only disable subscription, though when we get there, the subscription will already have been dropped, so we can safely just remove that call. It's unlikely that this could have had any practical impact, as the upper xprt layer would call this callback which would fail and not resubscribe. Having the lowest layer disabled would just be temporary since when re-enabling reading, a subscribe at the end of data would re-enable it. Backport should not harm but seems useless at this point.	2020-02-21 11:21:12 +01:00
Christopher Faulet	9d9d645409	BUG/MAJOR: http-ana: Always abort the request when a tarpit is triggered If a client error is reported on the request channel (CF_READ_ERROR) while a session is tarpitted, no error is returned to the client. Concretely, the http_reply_and_close() function is not called. This function is responsible for forwarding the error to the client, but it is also responsible for aborting the request. Because this function is not called when a read error is reported on the request channel, and because the tarpit analyzer is the last one, there is nothing preventing a connection attempt on a server while it is totally unexpected. So, a useless connection to a backend server may be performed because of this bug. If an HTTP load-balancing algorithm is used on the backend side, it leads to a crash of HAProxy because the request was already erased. If you have tarpit rules and if you use an HTTP load-balancing algorithm on your backends, you must apply this patch. Otherwise a simple TCP reset on a tarpitted connection will most likely crash your HAProxy. A safe workaround is to use a silent-drop rule or a deny rule instead of a tarpit. This bug also affects the legacy code. It is in fact a very old hidden bug. But the refactoring of process_stream() in 1.9 makes it visible. And, unfortunately, with the HTX, it is easier to hit it because much of the processing has been moved to lower layers, in the muxes. It must be backported as far as 1.9. For 2.0 and 1.9, the legacy HTTP code must also be patched the same way. For older versions, it may be backported but the bug does not seem to impact them. Thanks to Olivier D <webmaster@ajeux.com> for reporting the bug and providing all the information needed to analyze it.	2020-02-21 11:18:08 +01:00
Tim Duesterhus	e8aa5f24d6	BUG/MINOR: ssl: Stop passing dynamic strings as format arguments gcc complains rightfully: src/ssl_sock.c: In function ‘ssl_load_global_issuers_from_path’: src/ssl_sock.c:9860:4: warning: format not a string literal and no format arguments [-Wformat-security] ha_warning(warn); ^ Introduced in `70df7bf19c`.	2020-02-19 11:46:18 +01:00
Christopher Faulet	6072beb214	MINOR: http-ana: Match on the path if the monitor-uri starts by a / If the monitor-uri starts with a slash ('/'), the matching is performed against the request's path instead of the request's uri. It is a workaround to let the HTTP/2 requests match the monitor-uri. Indeed, in HTTP/2, clients are encouraged to send absolute URIs only. This patch is not tagged as a bug, because the previous behavior matched exactly what the doc describes. But it may be surprising that HTTP/2 requests don't match the monitor-uri. This patch may be backported to 2.1 because URIs of HTTP/2 are stored using the absolute-form starting with this version. For previous versions, this patch will only help explicitly absolute HTTP/1 requests (and only the HTX part because on the legacy HTTP, all the URI is matched). It should fix issue #509.	2020-02-18 16:29:29 +01:00
Christopher Faulet	d27689e952	BUG/MINOR: http-ana: Matching on monitor-uri should be case-sensitive The monitor-uri should be case-sensitive. In reality, the scheme and the host part are case-insensitive and only the path is case-sensitive. But concretely, since the start, the matching on the monitor-uri is case-sensitive. And it is probably the expected behavior of almost all users. This patch must be backported as far as 1.9. For HAProxy 2.0 and 1.9, it must be applied on src/proto_htx.c.	2020-02-18 16:29:23 +01:00
Emmanuel Hocdet	70df7bf19c	MINOR: ssl: add "issuers-chain-path" directive. Certificates loaded with "crt" and "crt-list" commonly share the same intermediate certificate in PEM file. "issuers-chain-path" is a global directive to share intermediate chain certificates in a directory. If certificates chain is not included in certificate PEM file, haproxy will complete chain if issuer match the first certificate of the chain stored via "issuers-chain-path" directive. Such chains will be shared in memory.	2020-02-18 14:33:05 +01:00
Willy Tarreau	23997daf4e	BUG/MINOR: sample: exit regsub() in case of trash allocation error As reported in issue #507, since commit `07e1e3c93e` ("MINOR: sample: regsub now supports backreferences") we must not proceed in regsub() if we fail to allocate a trash (which in practice never happens). No backport needed.	2020-02-18 14:27:44 +01:00
Christopher Faulet	0f19e43f2e	BUG/MINOR: stream: Don't incr frontend cum_req counter when stream is closed This counter is already incremented when a new request is received (or if an error occurred waiting for it). So it must not be incremented when the stream is terminated, at the end of process_stream(). This bug was introduced by the commit `cff0f739e` ("MINOR: counters: Review conditions to increment counters from analysers"). No backport needed.	2020-02-18 11:56:22 +01:00
Christopher Faulet	34b18e4391	BUG/MINOR: http-htx: Don't return error if authority is updated without changes When a Host header is updated, the authority part, if any, is also updated to keep both synchronized. But, when the update is performed while there is no change, a failure is reported while, in reality, no update is necessary. This bug was introduced by the commit `d7b7a1ce5` ("MEDIUM: http-htx: Keep the Host header and the request start-line synchronized"). This commit was pushed in 2.1. But on this version, the bug is hidden because rewrite errors are silently ignored. And because it happens when there is no change, if the rewrite fails, no one notices it. But since 2.2, rewrite errors are fatal by default. So when the bug is hit, a 500 error is returned to the client. Without this fix, a workaround is to disable the strict rewriting mode (see the "strict-mode" HTTP rule). The following HTTP rule is a good way to reproduce the bug if a request with an authority is received. In HTTP/2, it is pretty common. acl host_header_exists req.hdr(host) -m found http-request set-header host %[req.hdr(host)] if host_header_exists This patch must be backported to 2.1 and everywhere the commit `d7b7a1ce5` is backported. It should fix issue #494.	2020-02-18 11:19:57 +01:00
Christopher Faulet	9c44e4813c	BUG/MINOR: filters: Count HTTP headers as filtered data but don't forward them In flt_analyze_http_headers() HTTP analyzer, we must not forward systematically the headers. We must only count them as filtered data (ie. increment the offset of the right size). It is the http_payload callback responsibility to decide to forward headers or not by forwarding at least 1 byte of payload. And there is always at least 1 byte of payload to forward, the EOM block. This patch depends on following commits: * MINOR: filters: Forward data only if the last filter forwards something * MINOR: http-htx: Add a function to retrieve the headers size of an HTX message This patch must be backported with commits above as far as 1.9. In HAProxy 2.0 and 1.9, the patch must be adapted because of the legacy HTTP code.	2020-02-18 11:19:57 +01:00
Christopher Faulet	71179a3ea9	MINOR: filters: Forward data only if the last filter forwards something In flt_tcp_payload() and flt_http_payload(), if the last filter does not forward anything, nothing is forwarded, not even the already filtered data. For now, this patch is useless because the last filter is always in sync with the stream's offset. But it will be mandatory for a bugfix.	2020-02-18 11:19:57 +01:00
Christopher Faulet	727a3f1ca3	MINOR: http-htx: Add a function to retrieve the headers size of an HTX message http_get_hdrs_size() function may now be used to get the bytes held by headers in an HTX message. It only works if the headers were not already forwarded. Metadata are not counted here.	2020-02-18 11:19:57 +01:00
Jerome Magnin	07e1e3c93e	MINOR: sample: regsub now supports backreferences Now that the configuration parser is more flexible with samples, converters and their arguments, we can leverage this to enable support for backreferences in regsub.	2020-02-16 19:48:54 +01:00
Willy Tarreau	9af749b43e	BUG/MINOR: arg: fix again incorrect argument length check Recent commit `807aef8a14` ("BUG/MINOR: arg: report an error if an argument is larger than bufsize") aimed at fixing the argument length check but relied on the fact that the end of string was not reached, except that it forgot to consider the delimiters (comma and parenthesis) which are valid conditions to break out of the loop. This used to break simple expressions like "hdr(xff,1)". Thanks to Jérôme for reporting this. No backport is needed.	2020-02-16 10:49:55 +01:00
Willy Tarreau	807aef8a14	BUG/MINOR: arg: report an error if an argument is larger than bufsize Commit `ef21facd99` ("MEDIUM: arg: make make_arg_list() support quotes in arguments") removed the chunk_strncpy() to fill the trash buffer as the input is being parsed, and accidentally dropped the jump to the error path in case the argument is too large, which is now fixed. No backport is needed, this is for 2.2. This addresses issue #502.	2020-02-15 14:54:28 +01:00
Willy Tarreau	cd0d2ed6ee	MEDIUM: log-format: make the LF parser aware of sample expressions' end For a very long time it used to be impossible to pass a closing square bracket as a valid character in argument to a sample fetch function or to a converter because the LF parser used to stop on the first such character found and to pass what was between the first '[' and the first ']' to sample_parse_expr(). This patch addresses this by passing the whole string to sample_parse_expr() which is the only one authoritative to indicate the first character that does not belong to the expression. The LF parser then verifies it matches a ']' or fails. As a result it is finally possible to write rules such as the following, which is totally valid and unambiguous: http-request redirect location %[url,regsub([.:/?-],!,g)] Here "[.:/?-]" is arg1, "!" is arg2 and "g" is arg3 of the converter "regsub([.:/?-],!,g)"; "url,regsub([.:/?-],!,g)" is the sample expression and "%[url,regsub([.:/?-],!,g)]" is the log-format tag.	2020-02-14 19:02:06 +01:00
Willy Tarreau	e3b57bf92f	MINOR: sample: make sample_parse_expr() able to return an end pointer When an end pointer is passed, instead of complaining that a comma is missing after a keyword, sample_parse_expr() will silently return the pointer to the current location into this return pointer so that the caller can continue its parsing. This will be used by more complex expressions which embed sample expressions, and may even permit to embed sample expressions into arguments of other expressions.	2020-02-14 19:02:06 +01:00
Willy Tarreau	ef21facd99	MEDIUM: arg: make make_arg_list() support quotes in arguments Now it becomes possible to reuse the quotes within arguments, allowing the parser to distinguish a ',' or ')' that is part of the value from one which delimits the argument. In addition, ',' and ')' may be escaped using a backslash. However, it is also important to keep in mind that just like in shell, quotes are first resolved by the word tokenizer, so in order to pass quotes that are visible to the argument parser, a second level is needed, either using backslash escaping, or by using an alternate type. For example, it's possible to write this to append a comma: http-request add-header paren-comma-paren "%[str('(--,--)')]" or this: http-request add-header paren-comma-paren '%[str("(--,--)")]' or this: http-request add-header paren-comma-paren %[str(\'(--,--)\')] or this: http-request add-header paren-comma-paren %[str(\"(--,--)\")] or this: http-request add-header paren-comma-paren %[str(\"(\"--\',\'--\")\")] Note that due to the wide use of '\' in front of parenthesis in regex, the backslash character will purposely not escape parenthesis, so that '\)' placed in quotes is passed verbatim to a regex engine.	2020-02-14 19:02:06 +01:00
Willy Tarreau	338c670745	MEDIUM: arg: copy parsed arguments into the trash instead of allocating them For each and every argument parsed by make_arg_list(), there was an strndup() call, just so that we have a trailing zero for most functions, and this temporary buffer is released afterwards except for strings where it is kept. Proceeding like this is not convenient because 1) it performs a huge malloc/free dance, and 2) it forces to decide upfront where the argument ends, which is what prevents commas and right parenthesis from being used. This patch makes the function copy the temporary argument into the trash instead, so that we avoid the malloc/free dance for most all non-string args (e.g. integers, addresses, time, size etc), and that we can later produce the contents on the fly while parsing the input. It adds a length check to make sure that the argument is not longer than the buffer size, which should obviously never be the case but who knows what people put in their configuration.	2020-02-14 19:02:06 +01:00
Willy Tarreau	80b53ffb1c	MEDIUM: arg: make make_arg_list() stop after its own arguments The main problem we're having with argument parsing is that at the moment the caller looks for the first character looking like an end of arguments (')') and calls make_arg_list() on the sub-string inside the parenthesis. Let's first change the way it works so that make_arg_list() also consumes the parenthesis and returns the pointer to the first char not consumed. This will later permit to refine each argument parsing. For now there is no functional change.	2020-02-14 19:02:06 +01:00
Willy Tarreau	ed2c662b01	MINOR: sample/acl: use is_idchar() to locate the fetch/conv name Instead of scanning a string looking for an end of line, ')' or ',', let's only accept characters which are actually valid identifier characters. This will let the parser know that in %[src], only "src" is the sample fetch name, not "src]". This was done both for samples and ACLs since they are the same here.	2020-02-14 19:02:06 +01:00
Christopher Faulet	6c57f2da43	MINOR: mux-fcgi: Make the capture of the path-info optional in pathinfo regex Now, only one capture is mandatory in the path-info regex, the one matching the script-name. The path-info capture is optional. Of course, it must be defined to fill the PATH_INFO parameter. But it is not mandatory. This way, it is possible to get the script-name part from the path, excluding the path-info. This patch is small enough to be backported to 2.1.	2020-02-14 18:31:29 +01:00
Christopher Faulet	28cb36613b	BUG/MINOR: mux-fcgi: Forbid special characters when matching PATH_INFO param If a regex to match the PATH_INFO parameter is configured, it systematically fails if a newline or a null character is present in the URL-decoded path. So, from the moment there is at least a "%0a" or a "%00" in the request path, we always fail to get the PATH_INFO parameter and all the decoded path is used for the SCRIPT_NAME parameter. It is probably not the expected behavior. Because, most of the time, these characters are not expected at all in a path, an error is now triggered when one of these characters is found in the URL-decoded path before trying to execute the path_info regex. However, this test is not performed if there is no regex configured. Note that in reality, the newline character is only a problem when HAProxy is compiled with the pcre or pcre2 library and conversely, the null character is only a problem for the libc's regex library. But both are always excluded to avoid any inconsistency depending on compile options. An alternative, not implemented yet, is to replace these characters by another one. If someone complains about this behavior, it will be re-evaluated. This patch must be backported to all versions supporting the FastCGI applications, so to 2.1 for now.	2020-02-14 16:02:35 +01:00
Olivier Houchard	12ffab03b6	BUG/MEDIUM: muxes: Use the right argument when calling the destroy method. When calling the mux "destroy" method, the argument should be the mux context, not the connection. In a few instances in the mux code, the connection was used (mainly when the session wouldn't handle the idle connection, and the server pool was full), and that could lead to random segfaults. This should be backported to 2.1, 2.0, and 1.9.	2020-02-14 13:28:38 +01:00
William Dauchy	f7dcdc8a6f	BUG/MINOR: namespace: avoid closing fd when socket failed in my_socketat We cannot return right after socket opening as we need to move back to the default namespace first. This should fix github issue #500. This might be backported to all versions >= 1.6. Fixes: `b3e54fe387` ("MAJOR: namespace: add Linux network namespace support") Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-02-14 04:23:08 +01:00
William Dauchy	97a7bdac3e	BUG/MINOR: tcp: don't try to set defaultmss when value is negative When `getsockopt` previously failed, we were trying to set defaultmss with a -2 value. This is a followup of github issue #499. This should be backported to all versions >= v1.8. Fixes: `153659f1ae` ("MINOR: tcp: When binding socket, attempt to reuse one from the old proc.") Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-02-12 16:01:50 +01:00
William Dauchy	c0e23aef05	BUG/MINOR: tcp: avoid closing fd when socket failed in tcp_bind_listener We were trying to close the file descriptor even when the `socket` call was failing. This should fix github issue #499. This should be backported to all versions >= v1.8. Fixes: `153659f1ae` ("MINOR: tcp: When binding socket, attempt to reuse one from the old proc.") Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-02-12 15:24:21 +01:00
Willy Tarreau	0948a781fc	BUG/MINOR: listener: enforce all_threads_mask on bind_thread on init When initializing a listener, let's make sure the bind_thread mask is always limited to all_threads_mask when inserting the FD. This will avoid seeing listening FDs with bits corresponding to threads that are not active (e.g. when using "bind ... process 1/even"). The side effect is very limited, all that was identified is that atomic operations are used in fd_update_events() when not necessary. It's more a matter of long-term correctness in practice. This fix might be backported as far as 1.8 (then proto_sockpair must be dropped).	2020-02-12 10:21:49 +01:00
Willy Tarreau	50b659476c	BUG/MEDIUM: listener: only consider running threads when resuming listeners In bug #495 we found that it is possible to resume a listener on a nonexistent thread. This happens when a bind's thread_mask contains bits out of the active threads mask, such as when using "1/odd" or "1/even". The thread_mask was used as-is to pick a thread number to re-enable the listener, and given that the highest number is used, 1/odd or 1/even can produce quite high thread numbers and crash the process by queuing some entries into non-existent lists. This bug results from an incomplete fix in commit 413e926ba ("BUG/MAJOR: listener: fix thread safety in resume_listener()") though it will only trigger if some bind lines are explicitly bound to thread numbers higher than the thread count. The fix must be backported to all branches having the fix above (as far as 1.8, though the code is different there, see the commit message in 1.8 for changes). There are a few other places where bind_thread is used without enforcing all_threads_mask, namely when doing fd_insert() while creating listeners. It seems harmless but would probably deserve another fix.	2020-02-12 10:21:33 +01:00
Willy Tarreau	e35d1d4f42	BUILD: http_act: cast file sizes when reporting file size error As seen in issue #496, st_size may be of varying types on different systems. Let's simply cast it to long long and use long long for all size outputs.	2020-02-11 10:58:56 +01:00
Willy Tarreau	157788c7b1	BUG/MINOR: connection: correctly retry I/O on signals Issue #490 reports that there are a few bogus constructs of the famous "do { if (cond) continue; } while (0)" in the connection code, that are used to retry on I/O failures caused by receipt of a signal. Let's turn them into the more correct "while (1) { if (cond) continue; break; }" instead. This may or may not be backported, it shouldn't have any visible effect.	2020-02-11 10:26:39 +01:00
Willy Tarreau	327ea5aec8	BUG/MINOR: unix: better catch situations where the unix socket path length is close to the limit We do have some checks for the UNIX socket path length to validate the full pathname of a unix socket but the pathname extension is only taken into account when using a bind_prefix. The second check only matches against MAXPATHLEN. So this means that path names between 98 and 108 might successfully parse but fail to bind. Let's adjust the check in the address parser and refine the error checking at the bind() step. This addresses bug #493.	2020-02-11 06:49:42 +01:00
Willy Tarreau	508f989758	BUG/MAJOR: mux-h2: don't wake streams after connection was destroyed In commit `477902b` ("MEDIUM: connections: Get ride of the xprt_done callback.") we added an unconditional call to h2_wake_some_streams() in h2_wake(), though we must not do it if the connection is destroyed or we end up with a use-after-free. In this case it's already done in h2_process() before destroying the connection anyway. Let's just add this test for now. A cleaner approach might consist in doing it in the h2_process() function itself when a connection status change is detected. No backport is needed, this is purely 2.2.	2020-02-11 04:42:05 +01:00
Christopher Faulet	67307796e6	BUG/MEDIUM: tcp-rules: Fix track-sc* actions for L4/L5 TCP rules A bug was introduced during TCP rules refactoring by the commit `ac98d81f4` ("MINOR: http-rule/tcp-rules: Make track-sc* custom actions"). There is no stream when L4/L5 TCP rules are evaluated. For these rulesets, in track-sc* actions, we must take care to rely on the session instead of the stream. Because of this bug, any evaluation of L4/L5 TCP rules using a track-sc* action leads to a crash of HAProxy. No backport needed, except if the above commit is backported.	2020-02-10 10:09:58 +01:00
William Lallemand	696f317f13	BUG/MEDIUM: ssl/cli: 'commit ssl cert' wrong SSL_CTX init The code which is supposed to apply the bind_conf configuration on the SSL_CTX was not called correctly. Indeed it was called with the previous SSL_CTX so the new ones were left with default settings. For example the ciphers were not changed. This patch fixes #429. Must be backported in 2.1.	2020-02-07 20:55:35 +01:00
Christopher Faulet	817c4e39e5	BUG/MINOR: http-act: Fix bugs on error path during parsing of return actions This patch fixes memory leaks and a null pointer dereference found by coverity on the error path when an HTTP return action is parsed. See issue #491. No need to backport this patch unless the HTTP return action is backported too.	2020-02-07 10:37:59 +01:00
Christopher Faulet	692a6c2e69	BUG/MINOR: http-act: Set stream error flag before returning an error In action_http_set_status(), when a rewrite error occurred, the stream error flag must be set before returning the error. No need to backport this patch except if commit `333bf8c33` ("MINOR: http-rules: Set SF_ERR_PRXCOND termination flag when a header rewrite fails") is backported. This bug was reported in issue #491.	2020-02-07 10:37:53 +01:00
Tim Duesterhus	f1bc24cb27	BUG/MINOR: acl: Fix type of log message when an acl is named 'or' The patch adding this check initially only issued a warning, instead of being fatal. It was changed before committing. However when making this change the type of the log message was not changed from `ha_warning` to `ha_alert`. This patch makes this forgotten adjustment. See `0cf811a5f9`. No backport needed. The initial patch was backported as a warning, thus the log message type is correct.	2020-02-06 22:16:07 +01:00
Tim Duesterhus	0cf811a5f9	MINOR: acl: Warn when an ACL is named 'or' Consider a configuration like this: > acl t always_true > acl or always_false > > http-response set-header Foo Bar if t or t The 'or' within the condition will be treated as a logical disjunction and the header will be set, despite the ACL 'or' being falsy. This patch makes it an error to declare such an ACL that will never work. This patch may be backported to stable releases, turning the error into a warning only (the code was written in a way to make this trivial). It should not break anything and might improve the users' lives.	2020-02-06 16:08:36 +01:00
Willy Tarreau	9d6bb5a546	BUILD: lua: silence a warning on systems where longjmp is not marked as noreturn If the longjmp() call is not flagged as "noreturn", for example, because the operating system doesn't target a gcc-compatible compiler, we may get this warning when building Lua : src/hlua.c: In function 'hlua_panic_ljmp': src/hlua.c:128:1: warning: no return statement in function returning non-void [-Wreturn-type] static int hlua_panic_ljmp(lua_State *L) { longjmp(safe_ljmp_env, 1); } ^~~~~~ The function's prototype cannot be changed because it must be compatible with Lua's callbacks. Let's simply enclose the call inside WILL_LJMP() which we created exactly to signal a call to longjmp(). It lets the compiler know we won't get back into the function and that the return statement is not needed.	2020-02-06 16:01:04 +01:00
Christopher Faulet	700d9e88ad	MEDIUM: lua: Add ability for actions to intercept HTTP messages It is now possible to intercept HTTP messages from a lua action and reply to clients. To do so, a reply object must be provided to the function txn:done(). It may contain a status code with a reason, a header list and a body. By default, if an empty reply object is used, an empty 200 response is returned. If no reply is passed when txn:done() is called, the previous behaviour is respected, the transaction is terminated and nothing is returned to the client. The same is done for TCP streams. When txn:done() is called, the action is terminated with the code ACT_RET_DONE on success and ACT_RET_ERR on error, interrupting the message analysis. The reply object may be created by hand in lua. Or txn:reply() may be called. If so, this object provides some methods to fill it: * Reply:set_status(<status> [, <reason>]) : Set the status and optionally the reason. If no reason is provided, the default one corresponding to the status code is used. * Reply:add_header(<name>, <value>) : Add a header. For a given name, the values are stored in an ordered list. * Reply:del_header(<name>) : Removes all occurrences of a header name. * Reply:set_body(<body>) : Set the reply body. Here are some examples, all doing the same: -- ex. 1 txn:done{ status = 400, reason = "Bad request", headers = { ["content-type"] = { "text/html" }, ["cache-control"] = { "no-cache", "no-store" }, }, body = "<html><body><h1>invalid request</h1></body></html>" } -- ex. 2 local reply = txn:reply{ status = 400, reason = "Bad request", headers = { ["content-type"] = { "text/html" }, ["cache-control"] = { "no-cache", "no-store" } }, body = "<html><body><h1>invalid request</h1></body></html>" } txn:done(reply) -- ex. 3 local reply = txn:reply() reply:set_status(400, "Bad request") reply:add_header("content-type", "text/html") reply:add_header("cache-control", "no-cache") reply:add_header("cache-control", "no-store") reply:set_body("<html><body><h1>invalid request</h1></body></html>") txn:done(reply)	2020-02-06 15:13:04 +01:00
Christopher Faulet	2c2c2e381b	MINOR: lua: Add act:wake_time() function to set a timeout when an action yields This function may be used to define a timeout when a lua action returns act:YIELD. It is a way to force the script to be reexecuted after a short time (defined in milliseconds). Unlike core:sleep() or core:yield(), the script is fully reexecuted if it returns act:YIELD. With core functions to yield, the script is interrupted and restarts from the yield point. When a script returns act:YIELD, it is finished but the message analysis is blocked on the action waiting for its end.	2020-02-06 15:13:04 +01:00
Christopher Faulet	0f3c8907c3	MINOR: lua: Create the global 'act' object to register all action return codes ACT_RET_* codes are now available from lua scripts. The global object "act" is used to register these codes as constants. Now, lua actions can return any of the following codes : * act.CONTINUE for ACT_RET_CONT * act.STOP for ACT_RET_STOP * act.YIELD for ACT_RET_YIELD * act.ERROR for ACT_RET_ERR * act.DONE for ACT_RET_DONE * act.DENY for ACT_RET_DENY * act.ABORT for ACT_RET_ABRT * act.INVALID for ACT_RET_INV For instance, the following script denies all requests : core.register_action("deny", { "http-req" }, function (txn) return act.DENY end) Thus "http-request lua.deny" does exactly the same as "http-request deny".	2020-02-06 15:13:03 +01:00
Christopher Faulet	7716cdf450	MINOR: lua: Get the action return code on the stack when an action finishes When an action successfully finishes, the action return code (ACT_RET_*) is now retrieved from the stack, if the first element is an integer. In addition, in hlua_txn_done(), the value ACT_RET_DONE is pushed on the stack before exiting. Thus, when a script uses this function, the corresponding action still finishes with the right code. Thanks to this change, the flag HLUA_STOP is now useless. So it has been removed. It is a mandatory step to allow a lua action to return any action return code.	2020-02-06 15:13:03 +01:00
Christopher Faulet	a20a653e07	BUG/MINOR: http-ana: Increment failed_resp counters on invalid response In http_process_res_common() analyzer, when an invalid response is reported, the failed_resp counters must be incremented. No need to backport this patch, except if the commit `b8a5371a` ("MEDIUM: http-ana: Properly handle internal processing errors") is backported too.	2020-02-06 15:13:03 +01:00
Christopher Faulet	07a718e712	CLEANUP: lua: Remove consistency check for sample fetches and actions It is not possible anymore to alter the HTTP parser state from lua sample fetches or lua actions. So there is no reason to still check for the parser state consistency.	2020-02-06 15:13:03 +01:00
Christopher Faulet	4a2c142779	MEDIUM: http-rules: Support extra headers for HTTP return actions It is now possible to append extra headers to the responses generated by HTTP return actions, as long as they are not based on an errorfile. For return actions based on errorfiles, these extra headers are ignored. To define an extra header, a "hdr" argument must be used with a name and a value. The value is a log-format string. For instance: http-request return status 200 hdr "x-src" "%[src]" hdr "x-dst" "%[dst]"	2020-02-06 15:13:03 +01:00
Christopher Faulet	24231ab61f	MEDIUM: http-rules: Add the return action to HTTP rules Thanks to this new action, it is now possible to return any response from HAProxy, with any status code, based on an errorfile, a file or a string. Unlike the other internal messages generated by HAProxy, these ones are not interpreted as errors. And it is not necessary to use a file containing a full HTTP response, although it is still possible. In addition, using a log-format string or a log-format file, it is possible to have responses with dynamic content. This action can be used on the request path or the response path. The only constraint is to have a response smaller than a buffer. And to avoid any warning, the buffer space reserved for headers rewriting should also be free. When a response is returned with a file or a string as payload, it only contains the content-length header and the content-type header, if applicable. Here are examples: http-request return content-type image/x-icon file /var/www/favicon.ico \ if { path /favicon.ico } http-request return status 403 content-type text/plain \ lf-string "Access denied. IP %[src] is blacklisted." \ if { src -f /etc/haproxy/blacklist.lst }	2020-02-06 15:12:54 +01:00
Christopher Faulet	6d0c3dfac6	MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding This patch introduces the 'http-after-response' rules. These rules are evaluated at the end of the response analysis, just before the data forwarding, on ALL HTTP responses, the server ones but also all responses generated by HAProxy. Thanks to this ruleset, it is now possible for instance to add some headers to the responses generated by the stats applet. Following actions are supported : * allow * add-header * del-header * replace-header * replace-value * set-header * set-status * set-var * strict-mode * unset-var	2020-02-06 14:55:34 +01:00
Christopher Faulet	a72a7e49e8	MINOR: http-ana/http-rules: Use dedicated function to forward internal responses Call http_forward_proxy_resp() function when an internal response is returned. It concerns redirect, auth and error responses. But also 100-Continue and 103-Early-Hints responses. For errors, there is a subtlety. If the forward fails, an HTTP 500 error is generated if it is not already an internal error. For now http_forward_proxy_resp() cannot fail. But it will be possible when the new ruleset applied on all responses is added.	2020-02-06 14:55:34 +01:00
Christopher Faulet	ef70e25035	MINOR: http-ana: Add a function for forward internal responses Operations performed when internal responses (redirect/deny/auth/errors) are returned are always the same. The http_forward_proxy_resp() function is added to group all of them under a unique function.	2020-02-06 14:55:34 +01:00
Christopher Faulet	72c7d8d040	MINOR: http-ana: Rely on http_reply_and_close() to handle server error The http_server_error() function now relies on http_reply_and_close(). Both do almost the same actions. In addition, http_server_error() sets the error flag and the final state flag on the stream.	2020-02-06 14:55:34 +01:00
Christopher Faulet	60b33a5a62	MINOR: http-rules: Handle the rule direction when a redirect is evaluated The rule direction must be tested to do specific processing on the request path. The intercepted_req counter should be updated if the rule is evaluated on the frontend and the remaining request analyzers must be removed. But only on the request path. The rule direction must also be tested to set the right final stream state flag. This patch depends on the commit "MINOR: http-rules: Add a flag on redirect rules to know the rule direction". Both must be backported to all stable versions.	2020-02-06 14:55:34 +01:00
Christopher Faulet	c87e468816	MINOR: http-rules: Add a flag on redirect rules to know the rule direction HTTP redirect rules can be evaluated on the request or the response path. So when a redirect rule is evaluated, it is important to have this information because some specific processing may be performed depending on the direction. So the REDIRECT_FLAG_FROM_REQ flag has been added. It is set when applicable on the redirect rule during the parsing. This patch is mandatory to fix a bug on redirect rule. It must be backported to all stable versions.	2020-02-06 14:55:34 +01:00
Christopher Faulet	c20afb810f	BUG/MINOR: http-ana: Set HTX_FL_PROXY_RESP flag if a server perform a redirect It is important not to forget to specify that the HTX response was internally generated when a server performs a redirect. This information is used by the H1 multiplexer to choose the right connection mode when the response is sent to the client. This patch must be backported to 2.1.	2020-02-06 14:55:34 +01:00
Christopher Faulet	7a138dc908	BUG/MINOR: http-ana: Reset HTX first index when HAPRoxy sends a response The first index in an HTX message is the HTX block index from which the HTTP analysis must be performed. When HAProxy sends an HTTP response, on error or redirect, this index must be reset because all pending incoming data are considered as forwarded. For now, it is only a bug for 103-Early-Hints responses. For other responses, it is not a problem. But it will be when the new ruleset applied on all responses is added. For 103 responses, if the first index is not reset and there are rewriting rules on server responses, the generated 103 responses, if any, are evaluated too. This patch must be backported and probably adapted, at least for 103 responses, as far as 1.9.	2020-02-06 14:55:34 +01:00
Christopher Faulet	3b2bb63ded	MINOR: dns: Add function to release memory allocated for a do-resolve rule Memory allocated when a do-resolve rule is parsed is now released when HAProxy exits.	2020-02-06 14:55:34 +01:00
Christopher Faulet	a4168434a7	MINOR: dns: Dynamically allocate dns options to reduce the act_rule size <.arg.dns.dns_opts> field in the act_rule structure is now dynamically allocated when a do-resolve rule is parsed. This drastically reduces the structure size.	2020-02-06 14:55:34 +01:00
Christopher Faulet	637259e044	BUG/MINOR: http-ana: Don't overwrite outgoing data when an error is reported When an error is returned to a client, the right message is injected into the response buffer. It is performed by http_server_error() or http_reply_and_close(). Both ignore any data already present in the channel's buffer. While it is legitimate to remove all input data, it is important to not remove any outgoing data. So now, we try to append the error message to the response buffer, only removing input data. We rely on the channel_htx_copy_msg() function to do so. So this patch depends on the following two commits: * MINOR: htx: Add a function to append an HTX message to another one * MINOR: htx/channel: Add a function to copy an HTX message in a channel's buffer This patch must be backported as far as 1.9. However, above patches must be backported first.	2020-02-06 14:55:34 +01:00
Christopher Faulet	0ea0c86753	MINOR: htx: Add a function to append an HTX message to another one The htx_append_msg() function can now be used to append an HTX message to another one. Either the whole message is copied or nothing is. If an error occurs during the copy, all changes are rolled back. This patch is mandatory to fix a bug in http_reply_and_close() function. Be careful to backport it first.	2020-02-06 14:54:47 +01:00
Christopher Faulet	0a589fde7c	MINOR: http-htx: Emit a warning if an error file runs over the buffer's reserve If an error file is too big and, once converted to HTX, runs over the buffer space reserved for headers rewriting, a warning is emitted. Because a new set of rules will be added to allow headers rewriting on all responses, including HAProxy ones, it is important to always keep this space free for error files.	2020-02-06 09:36:36 +01:00
Christopher Faulet	333bf8c33f	MINOR: http-rules: Set SF_ERR_PRXCOND termination flag when a header rewrite fails When a header rewrite fails, an internal error is triggered. But SF_ERR_INTERNAL is documented to be the consequence of a bug and must be reported to the dev team. So, when this happens, the SF_ERR_PRXCOND termination flag is set now.	2020-02-06 09:36:36 +01:00
Christopher Faulet	546c4696bb	MINOR: global: Set default tune.maxrewrite value during global structure init When the global structure is initialized, instead of setting tune.maxrewrite to -1, its default value can be immediately set. This way, it is always defined during the configuration validity check. Otherwise, the only way to have it at this stage is to explicitly set it in the global section.	2020-02-06 09:36:36 +01:00
Christopher Faulet	91e31d83c9	BUG/MINOR: http-act: Use the good message to test strict rewritting mode Since the strict rewriting mode was introduced, actions manipulating headers (set/add/replace) always rely on the request message to test if the HTTP_MSGF_SOFT_RW flag is set or not. But, of course, we must only rely on the request for http-request rules. For http-response rules, we must use the response message. This patch must be backported if the strict rewriting is backported too.	2020-02-06 09:36:36 +01:00
Tim Duesterhus	d02ffe9b6d	CLEANUP: peers: Remove unused static function `free_dcache_tx` The function was added in commit `6c39198b57`, but was only used within a single function `free_dcache` which was unused itself. See issue #301. See commit `10ce0c2f31` which removed `free_dcache`.	2020-02-05 23:40:17 +01:00
Tim Duesterhus	10ce0c2f31	CLEANUP: peers: Remove unused static function `free_dcache` The function was changed to be static in commit `6c39198b57`, but even that commit no longer uses it. The purpose of the change vs. outright removal is unclear. see issue #301	2020-02-05 18:49:29 +01:00
Willy Tarreau	077d366ef7	CLEANUP: hpack: remove a redundant test in the decoder As reported in issue #485 the test for !len at the end of the loop in get_var_int() is useless since it was already done inside the loop. Actually the code is more readable if we remove the first one so let's do this instead. The resulting code is exactly the same since the compiler already optimized the test away.	2020-02-05 15:39:08 +01:00
William Lallemand	4dd145a888	BUG/MINOR: ssl: clear the SSL errors on DH loading failure In ssl_sock_load_dh_params(), if haproxy failed to apply the dhparam with SSL_CTX_set_tmp_dh(), it will apply the DH with SSL_CTX_set_dh_auto(). The problem is that we don't clean the OpenSSL errors when leaving this function so it could fail to load the certificate, even if it's only a warning. Fixes bug #483. Must be backported in 2.1.	2020-02-05 15:32:24 +01:00
Willy Tarreau	731248f0db	BUG/MINOR: ssl: we may only ignore the first 64 errors We have the ability per bind option to ignore certain errors (CA, crt, ...), and for this we use a 64-bit field. In issue #479 coverity reports a risk of too large a left shift. For now as of OpenSSL 1.1.1 the highest error value that may be reported by X509_STORE_CTX_get_error() seems to be around 50 so there should be no risk yet, but it's enough of a warning to add a check so that we don't accidentally hide random errors in the future. This may be backported to relevant stable branches.	2020-02-04 14:04:36 +01:00
William Lallemand	3af48e706c	MINOR: ssl: ssl-load-extra-files configure loading of files This new setting in the global section alters the way HAProxy will look for unspecified files (.ocsp, .sctl, .issuer, bundles) during the loading of the SSL certificates. By default, HAProxy discovers automatically a lot of files not specified in the configuration, and you may want to disable this behavior if you want to optimize the startup time. This patch sets flags in global_ssl.extra_files and then check them before trying to load an extra file.	2020-02-03 17:50:26 +01:00
Olivier Houchard	04f5fe87d3	BUG/MEDIUM: memory: Add a rwlock before freeing memory. When using lockless pools, add a new rwlock, flush_pool. Read-lock it when getting memory from the pool, so that concurrent accesses are still authorized, but write-lock it when we're about to free memory, in pool_flush() and pool_gc(). The problem is, when removing an item from the pool, we dereference it to get the next one, however, that pointer may have been freed in the meantime, and that could provoke a crash if the pointer has been unmapped. It should be OK to use a rwlock, as normal operations will still be able to access the pool concurrently, and calls to pool_flush() and pool_gc() should be pretty rare. This should be backported to 2.1, 2.0 and 1.9.	2020-02-01 18:08:34 +01:00
Olivier Houchard	8af97eb4a1	MINOR: memory: Only init the pool spinlock once. In pool_create(), only initialize the pool spinlock if we just created the pool, in the event we're reusing it, there's no need to initialize it again.	2020-02-01 18:08:34 +01:00
Olivier Houchard	b6fa08bc7b	BUG/MEDIUM: memory_pool: Update the seq number in pool_flush(). In pool_flush(), we can't just set the free_list to NULL, or we may suffer the ABA problem. Instead, use a double-width CAS and update the sequence number. This should be backported to 2.1, 2.0 and 1.9. This may, or may not, be related to github issue #476.	2020-02-01 18:08:34 +01:00
Willy Tarreau	952c2640b0	MINOR: task: don't set TASK_RUNNING on tasklets We can't clear flags on tasklets because we don't know if they're still present upon return (they all return NULL, maybe that could change in the future). As a side effect, once TASK_RUNNING is set, it's never cleared anymore, which is misleading and resulted in some incorrect flagging of bulk tasks in the recent scheduler changes. And the only reason for setting TASK_RUNNING on tasklets was to detect self-wakers, which is now done using a dedicated flag. So instead of setting this flag with no opportunity to clear it, let's simply not set it.	2020-01-31 18:37:03 +01:00
Willy Tarreau	1dfc9bbdc6	OPTIM: task: readjust CPU bandwidth distribution since last update Now that we can more accurately watch which connection is really being woken up from itself, it was desirable to re-adjust the CPU BW thresholds based on measurements. New tests with 60000 concurrent connections were run at 100 Gbps with unbounded queues and showed the following distribution: scenario TC0 TC1 TC2 observation -------------------+---+---+----+--------------------------- TCP conn rate : 32, 51, 17 HTTP conn rate : 34, 41, 25 TCP byte rate : 2, 3, 95 (2 MB objects) splicing byte rate: 11, 6, 83 (2 MB objects) H2 10k object : 44, 23, 33 client-limited mixed traffic : 18, 10, 72 21m+10: 11kcps, 36 Gbps The H2 experienced a huge change since it uses a persistent connection that was accidentally flagged in the previous test. The splicing test exhibits a higher need for short tasklets, so does the mixed traffic test. Given that latency mainly matters for conn rate and H2 here, the ratios were readjusted as 33% for TC0, 50% for TC1 and 17% for TC2, keeping in mind that whatever is not consumed by one class is automatically shared in equal proportions by the next one(s). This setting immediately provided a nice improvement as with the default settings (maxpollevents=200, runqueue-depth=200), the same ratios as above are still reported, while the time to request "show activity" on the CLI dropped to 30-50ms. The average loop time is around 5.7ms on the mixed traffic. In addition, one extra stress test at 90.5 Gbps with 5100 conn/s shows 70-100ms CLI request time, with an average loop time of 17 ms.	2020-01-31 18:37:01 +01:00
Willy Tarreau	d23d413e38	MINOR: task: make sched->current also reflect tasklets sched->current is used to know the current task/tasklet, and is currently only used by the panic dump code. However it turns out it was not set for tasklets, which prevents us from using it for more usages, despite the panic handling code already handling this case very well. Let's make sure it's now set.	2020-01-31 17:45:10 +01:00
Willy Tarreau	bb238834da	MINOR: task: permanently flag tasklets waking themselves up Commit `a17664d829` ("MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet") tried to inflict a penalty to self-requeuing tasks/tasklets which correspond to those involved in large, high-latency data transfers, for the benefit of all other processing which requires a low latency. However, it turns out that while it ought to do this on a case-by-case basis, basing itself on the RUNNING flag isn't accurate because this flag doesn't leave for tasklets, so we'd rather need a distinct flag to tag such tasklets. This commit introduces TASK_SELF_WAKING to mark tasklets acting like this. For now it's still set when TASK_RUNNING is present but this will have to change. The flag is kept across wakeups.	2020-01-31 17:45:10 +01:00
Olivier Houchard	849d4f047f	BUG/MEDIUM: connections: Don't forget to unlock when killing a connection. Commit `140237471e` made sure we hold the toremove_lock for the corresponding thread before removing a connection from its idle_orphan_conns list, however it failed to unlock it if we found a connection, leading to a deadlock, so add the missing unlock. This should be backported to 2.1 and 2.0.	2020-01-31 17:25:37 +01:00
Willy Tarreau	c633607c06	OPTIM: task: refine task classes default CPU bandwidth ratios Measures with unbounded execution ratios under 40000 concurrent connections at 100 Gbps showed the following CPU bandwidth distribution between task classes depending on traffic scenarios: scenario TC0 TC1 TC2 observation -------------------+---+---+----+--------------------------- TCP conn rate : 29, 48, 23 221 kcps HTTP conn rate : 29, 47, 24 200 kcps TCP byte rate : 3, 5, 92 53 Gbps splicing byte rate: 5, 10, 85 70 Gbps H2 10k object : 10, 21, 74 client-limited mixed traffic : 4, 7, 89 21m+10: 11kcps, 36 Gbps Thus it seems that we always need a bit of bulk tasks even for short connections, which seems to imply a suboptimal processing somewhere, and that there are roughly twice as many tasks (TC1=normal) as regular tasklets (TC0=urgent). This ratio stands even when data forwarding increases. So at first glance it looks reasonable to enforce the following ratio by default: - 16% for TL_URGENT - 33% for TL_NORMAL - 50% for TL_BULK With this, the TCP conn rate climbs to ~225 kcps, and the mixed traffic pattern shows a more balanced 17kcps + 35 Gbps with 35ms CLI request time instead of 11kcps + 36 Gbps and 400 ms response time. The byte rate tests (1M objects) are not affected at all. This setting looks "good enough" to allow immediate merging, and could be refined later. It's worth noting that it resists very well to massive increase of run queue depth and maxpollevents: with the run queue depth changed from 200 to 10000 and maxpollevents to 10000 as well, the CLI's request time is back to the previous ~400ms, but the mixed traffic test reaches 52 Gbps + 7500 CPS, which was never met with the previous scheduling model, while the CLI used to show ~1 minute response time. 
The reason is that in the bulk class it becomes possible to perform multiple rounds of recv+send and eliminate objects at once, increasing the L3 cache hit ratio, and keeping the connection count low, without degrading too much the latency. Another test with mixed traffic involving 2/3 splicing on huge objects and 1/3 on empty objects without touching any setting reports 51 Gbps + 5300 cps and 35ms CLI request time.	2020-01-31 07:09:10 +01:00
Willy Tarreau	a62917b890	MEDIUM: tasks: implement 3 different tasklet classes with their own queues We used to mix high latency tasks and low latency tasklets in the same list, and to even refill bulk tasklets there, causing some unfairness in certain situations (e.g. poll-less transfers between many connections saturating the machine with similarly-sized in and out network interfaces). This patch changes the mechanism to split the load into 3 lists depending on the task/tasklet's desired classes : - URGENT: this is mainly for tasklets used as deferred callbacks - NORMAL: this is for regular tasks - BULK: this is for bulk tasks/tasklets Arbitrary ratios of max_processed are picked from each of these lists in turn, with the ability to complete in one list from what was not picked in the previous one. After some quick tests, the following setup gave apparently good results both for raw TCP with splicing and for H2-to-H1 request rate: - 0 to 75% for urgent - 12 to 50% for normal - 12 to what remains for bulk Bulk is not used yet.	2020-01-30 18:59:33 +01:00
Willy Tarreau	4ffa0b526a	MINOR: tasks: move the list walking code to its own function New function run_tasks_from_list() will run over a tasklet list and will run all the tasks and tasklets it finds there within a limit of <max> that is passed in argument. This is a preliminary work for scheduler QoS improvements.	2020-01-30 18:13:13 +01:00
Willy Tarreau	876b411f2b	BUG/MEDIUM: pipe/thread: fix atomicity of pipe counters Previous patch `160287b676` ("MEDIUM: pipe/thread: maintain a per-thread local cache of recently used pipes") didn't replace all pipe counter updates with atomic ops since some were already under a lock, which is obviously not a valid reason since these ones can be updated in parallel to other atomic ops. The result was that the pipes_used could seldom be seen as negative in the stats (harmless) but also this could result in slightly more pipes being allocated than permitted, thus stealing a few file descriptors that were not usable for connections anymore. Let's use pure atomic ops everywhere these counters are updated. No backport is needed.	2020-01-30 09:15:37 +01:00
Willy Tarreau	160287b676	MEDIUM: pipe/thread: maintain a per-thread local cache of recently used pipes In order to completely remove the pipe locking cost and try to reuse hot pipes, each thread now maintains a local cache of recently used pipes that is no larger than its share (maxpipes/nbthreads). All extra pipes are instead refilled into the global pool. Allocations are made from the local pool first, and fall back to the global one before allocating one. This completely removes the observed pipe locking cost at high bit rates, which was still around 5-6%.	2020-01-29 11:12:07 +01:00
Willy Tarreau	a945cfdfe0	MEDIUM: pipe/thread: reduce the locking overhead In a quick test involving splicing, we can see that get_pipe() and put_pipe() together consume up to 12% of the CPU. That's not surprising considering how much work is performed under the lock, including the pipe struct allocation, the pipe creation and its initialization. Same for releasing, we don't need a lock there to call close() nor to free to the pool. Changing this alone was enough to cut the overhead in half. A better approach should consist in having a per-thread pipe cache, which will also help keep pages hot in the CPU caches.	2020-01-29 10:44:00 +01:00
William Lallemand	a25a19fdee	BUG/MINOR: ssl/cli: fix unused variable with openssl < 1.0.2 src/ssl_sock.c: In function ‘cli_io_handler_show_cert’: src/ssl_sock.c:10214:6: warning: unused variable ‘n’ [-Wunused-variable] int n; ^ Fix this problem in the io handler of the "show ssl cert" function.	2020-01-29 00:08:10 +01:00
Willy Tarreau	1113116b4a	MEDIUM: raw-sock: remove obsolete calls to fd_{cant,cond,done}_{send,recv} Given that raw_sock's functions solely act on connections and that all its callers properly use subscribe() when they want to receive/send more, there is no more reason for calling fd_{cant,cond,done}_{send,recv} anymore as this call is immediately overridden by the subscribe call. It's also worth noting that fd_cond_recv(), whose purpose was to speculatively enable reading in the FD cache if the FD was active but not yet polled, was made to save on expensive epoll_ctl() calls and was implicitly covered more cleanly by recent commit `5d7dcc2a8e` ("OPTIM: epoll: always poll for recv if neither active nor ready"). No change on the number of calls to epoll_ctl() was noticed consecutive to this change.	2020-01-28 19:06:41 +01:00
William Dauchy	1e2256d4d3	MINOR: proxy: clarify number of connections log when stopping This log could sometimes be a bit confusing (depending on the number, in fact) when you read it (e.g. is it the number of active connections?) - only trained eyes know that haproxy outputs a different log when closing active connections while stopping. Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-01-28 13:10:03 +01:00
William Dauchy	aecd5dcac2	BUG/MINOR: dns: allow 63 char in hostname hostname were limited to 62 char, which is not RFC1035 compliant; - the parsing loop should stop when above max label char - fix len label test where d[i] was wrongly used - simplify the whole function to avoid using two extra char* variable this should fix github issue #387 Signed-off-by: William Dauchy <w.dauchy@criteo.com> Reviewed-by: Tim Duesterhus <tim@bastelstu.be> Acked-by: Baptiste <bedis9@gmail.com>	2020-01-28 13:08:08 +01:00
William Dauchy	bd8bf67102	BUG/MINOR: connection: fix ip6 dst_port copy in make_proxy_line_v2 triggered by coverity; src_port is set earlier. this should fix github issue #467 Fixes: `7fec021537` ("MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed") This should be backported to 1.8. Signed-off-by: William Dauchy <w.dauchy@criteo.com> Reviewed-by: Tim Duesterhus <tim@bastelstu.be>	2020-01-28 13:02:58 +01:00
Christopher Faulet	c20b37112b	BUG/MINOR: http-rules: Always init log-format expr for common HTTP actions Many HTTP actions rely on <.arg.http> in the act_rule structure. Not all actions use the log-format expression, but it must be initialized anyway. Otherwise, HAProxy may crash during the deinit when the release function is called. No backport needed. This patch should fix issue #468.	2020-01-27 15:51:57 +01:00
Willy Tarreau	74ab7d2b80	BUG/MINOR: tcpchecks: fix the connect() flags regarding delayed ack In issue #465, we see that Coverity detected dead code in checks.c which is in fact a missing parenthesis to build the connect() flags consecutive to the API change in commit `fdcb007ad8` ("MEDIUM: proto: Change the prototype of the connect() method."). The impact should be imperceptible as in the best case it may have resulted in a missed optimization trying to save a syscall or to merge outgoing packets. It may be backported as far as 2.0 though it's not critical.	2020-01-24 17:52:37 +01:00
Olivier Houchard	1fc5a648bf	MEDIUM: streams: Don't close the connection in back_handle_st_rdy(). In back_handle_st_rdy(), don't bother trying to close the connection, it should be taken care of somewhere else.	2020-01-24 15:40:34 +01:00
Olivier Houchard	7c30642ede	MEDIUM: streams: Don't close the connection in back_handle_st_con(). In back_handle_st_con(), don't bother trying to close the connection, it should be taken care of elsewhere.	2020-01-24 15:40:34 +01:00
Olivier Houchard	b43589cac5	BUG/MEDIUM: stream: Don't install the mux in back_handle_st_con(). In back_handle_st_con(), don't bother setting up the mux, it is now done by conn_fd_handler().	2020-01-24 15:40:34 +01:00
Olivier Houchard	efe5e8e998	BUG/MEDIUM: ssl: Don't forget to free ctx->ssl on failure. In ssl_sock_init(), if we fail to allocate the BIO, don't forget to free the SSL *, or we'd end up with a memory leak. This should be backported to 2.1 and 2.0.	2020-01-24 15:17:38 +01:00
Olivier Houchard	6d53cd6978	MINOR: ssl: Remove dead code. Now that we don't call the handshake function directly, but merely wake the tasklet, we can no longer have CO_FL_ERR, so don't bother checking it.	2020-01-24 15:13:57 +01:00
Frédéric Lécaille	3139c1b198	BUG/MINOR: ssl: Possible memleak when allowing the 0RTT data buffer. As the server early data buffer is allocated in the middle of the loop used to allocate the SSL session without being freed before retrying, this leads to a memory leak. To fix this we move the section of code responsible for this early data buffer allocation after the one responsible for allocating the SSL session. Must be backported to 2.1 and 2.0.	2020-01-24 15:12:21 +01:00
Olivier Houchard	ecffb7d841	BUG/MEDIUM: streams: Move the conn_stream allocation outside #IF USE_OPENSSL. When commit `477902bd2e` made the conn_stream allocation unconditional, it unfortunately moved the code doing the allocation inside #if USE_OPENSSL, which means anybody compiling haproxy without openssl wouldn't allocate any conn_stream, and would get a segfault later. Fix that by moving the code that does the allocation outside #if USE_OPENSSL.	2020-01-24 14:14:35 +01:00
Christopher Faulet	99ac8a1aa4	BUG/MINOR: stream: Be sure to have a listener to increment its counters In process_stream(), when a client or a server abort is handled, the corresponding listener's counter is incremented. But, we must be sure to have a listener attached to the session. This bug was introduced by the commit `cff0f739e5`. Thanks to Fred for reporting the bug to me. No need to backport this patch, except if commit `cff0f739e5` is backported.	2020-01-24 11:55:17 +01:00
Christopher Faulet	be20cf36af	BUG/MINOR: http-ana: Increment the backend counters on the backend A stupid cut-paste bug was introduced in the commit `cff0f739e5`. Backend counters must of course be incremented on the stream's backend. Not the frontend. No need to backport this patch, except if commit `cff0f739e5` is backported.	2020-01-24 11:55:17 +01:00
Willy Tarreau	645c588e71	BUILD: cfgparse: silence a bogus gcc warning on 32-bit machines A first patch was made during 2.0-dev to silence a bogus warning emitted by gcc : `dd1c8f1f72` ("MINOR: cfgparse: Add a cast to make gcc happier."), but it happens it was not sufficient as the warning re-appeared on 32-bit machines under gcc-8 and gcc-9 : src/cfgparse.c: In function 'check_config_validity': src/cfgparse.c:3642:33: warning: argument 1 range [2147483648, 4294967295] exceeds maximum object size 2147483647 [-Walloc-size-larger-than=] newsrv->idle_orphan_conns = calloc((unsigned int)global.nbthread, sizeof(*newsrv->idle_orphan_conns)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This warning doesn't trigger in other locations, and it immediately vanishes if the previous or subsequent loops do not depend on global.nbthread anymore, or if the field ordering of the struct server changes! As discussed in the thread at: https://www.mail-archive.com/haproxy@formilux.org/msg36107.html playing with -Walloc-size-larger-than has no effect. And a minimal reproducer could be isolated, indicating it's pointless to circle around this one. Let's just cast nbthread to ushort so that gcc cannot make this wrong detection. It's unlikely we'll use more than 65535 threads in the near future anyway. This may be backported to older releases if they are also affected, at least to ease the job of distro maintainers. Thanks to Ilya for testing.	2020-01-24 11:30:06 +01:00
Tim Duesterhus	541fe1ec52	MINOR: lua: Add HLUA_PREPEND_C?PATH build option This complements the lua-prepend-path configuration option to allow distro maintainers to add a default path for HAProxy specific Lua libraries.	2020-01-24 09:22:03 +01:00
Tim Duesterhus	dd74b5f237	MINOR: lua: Add lua-prepend-path configuration option lua-prepend-path allows the administrator to specify a custom Lua library path to load custom Lua modules that are useful within the context of HAProxy without polluting the global Lua library folder.	2020-01-24 09:22:03 +01:00
Tim Duesterhus	c9fc9f2836	MINOR: lua: Add hlua_prepend_path function This function is added in preparation for following patches.	2020-01-24 09:21:35 +01:00
Willy Tarreau	bb2c4ae065	BUG/MEDIUM: mux-h2: make sure we don't emit TE headers with anything but "trailers" While the H2 parser properly checks for the absence of anything but "trailers" in the TE header field, we forget to check this when sending the request to an H2 server. The problem is that an H2->H2 conversion may keep "gzip" and fail on the next stage. This patch makes sure that we only send "TE: trailers" if the TE header contains the "trailers" token, otherwise it's dropped. This fixes issue #464 and should be backported till 1.9.	2020-01-24 09:07:53 +01:00
Willy Tarreau	508d232a06	BUG/MINOR: stktable: report the current proxy name in error messages Since commit `1b8e68e89a` ("MEDIUM: stick-table: Stop handling stick-tables as proxies."), a rule referencing the current proxy with no table leads to the following error : [ALERT] 023/071924 (16479) : Proxy 'px': unable to find stick-table '(null)'. [ALERT] 023/071914 (16479) : Fatal errors found in configuration. for a config like this one: backend px stick on src This patch fixes it and should be backported as far as 2.0.	2020-01-24 07:19:34 +01:00
Willy Tarreau	f22758d12a	MINOR: connection: remove some unneeded checks for CO_FL_SOCK_WR_SH A few places in health checks and stream-int on the send path were still checking for this flag. Now we do not and instead we rely on snd_buf() to report the error if any. It's worth noting that all 3 real muxes still use CO_FL_SOCK_WR_SH and CO_FL_ERROR interchangeably at various places to decide to abort and/or free their data. This should be clarified and fixed so that only CO_FL_ERROR is used, and this will render the error paths simpler and more accurate.	2020-01-23 19:01:37 +01:00
Willy Tarreau	a8c7e8e3a8	MINOR: raw-sock: always check for CO_FL_SOCK_WR_SH before sending The test was added before splice() and send() to make sure we never accidentally send after a shutdown, because upper layers do not all check and it's not their job to do it. In such a case we also set errno to EPIPE so that the error can be accurately reported, e.g., in health checks.	2020-01-23 19:01:37 +01:00
Willy Tarreau	49139cb914	MINOR: connection: don't check for CO_FL_SOCK_WR_SH too early in handshakes Just like with CO_FL_SOCK_RD_SH, we don't need to check for this flag too early because conn_sock_send() already does it. No error was lost so it was harmless, it was only useless code.	2020-01-23 19:01:37 +01:00
Willy Tarreau	d838fb840c	MINOR: connection: do not check for CO_FL_SOCK_RD_SH too early The handshake functions dedicated to proxy proto, netscaler and socks4 all check for this flag before proceeding. This is wrong, they must not do and instead perform the call to recv() then report the close. The reason for this is that the current construct managed to lose the CO_ER_CIP_EMPTY error code in case the connection was already shut, thus causing a race condition with some errors being reported correctly or as unknown depending on the timing.	2020-01-23 18:05:18 +01:00
Willy Tarreau	6d015724ec	MINOR: connection: remove checks for CO_FL_HANDSHAKE before I/O There are still leftovers from the pre-xprt_handshake era with lots of places where I/O callbacks refrain from receiving/sending if they see that a handshake is present. This needlessly duplicates the subscribe calls as it will automatically be done by the underlying xprt_handshake code when attempting the operation. The only reason for still checking CO_FL_HANDSHAKE is when we decide to instantiate xprt_handshake. This patch removes all other ones.	2020-01-23 17:30:42 +01:00
Willy Tarreau	911db9bd29	MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE As mentioned in commit `c192b0ab95` ("MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of consistency on which flags are checked among L4/L6/HANDSHAKE depending on the code areas. A number of sample fetch functions only check for L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and many check both L4L6 and HANDSHAKE. This patch starts to make all of this more consistent by introducing a new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and reports whether the transport layer is ready or not. All inconsistent call places were updated to rely on this one each time the goal was to check for the readiness of the transport layer.	2020-01-23 16:34:26 +01:00
Willy Tarreau	4450b587dd	MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE Most places continue to check CO_FL_HANDSHAKE while in fact they should check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one dedicated to SSL renegotiation. In fact the SSL layer should be the only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data when a renegotiation is in progress, but other ones randomly include it without knowing. And ideally it should even be an internal flag that's not exposed in the connection. This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL. In order to limit the confusion that has accumulated over time, the CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake, possibly used by a renegotiation was moved after the other ones.	2020-01-23 16:34:26 +01:00
Willy Tarreau	18955db43d	MINOR: stream-int: always report received shutdowns As mentioned in `c192b0ab95` ("MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), si_cs_recv() currently does not propagate CS_FL_EOS to CF_READ_NULL if CO_FL_WAIT_L4L6 is set, while this situation doesn't exist anymore. Let's get rid of this confusing test.	2020-01-23 16:34:26 +01:00
Olivier Houchard	220a26c316	BUG/MEDIUM: 0rtt: Only consider the SSL handshake. We only add the Early-data header, or get ssl_fc_has_early to return 1, if we haven't already done the SSL handshake, as otherwise, we know the early data were fine, and there's no risk of replay attack. But to do so, we wrongly checked CO_FL_HANDSHAKE, we have to check CO_FL_SSL_WAIT_HS instead, as we don't care about the status of any other handshake. This should be backported to 2.1, 2.0, and 1.9. When deciding if we should add the Early-Data header, or if the sample fetch should return	2020-01-23 15:01:11 +01:00
Willy Tarreau	c192b0ab95	MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_* Commit `477902bd2e` ("MEDIUM: connections: Get ride of the xprt_done callback.") broke the master CLI for a very obscure reason. It happens that short requests immediately terminated by a shutdown are properly received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not yet set on the connection since we've not passed again through conn_fd_handler() and it was not done in conn_complete_session(). While commit `a8a415d31a` ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session()") fixed the issue, such an accident may happen again as the root cause is deeper and actually comes down to the fact that CO_FL_CONNECTED is lazily set at various check points in the code but not every time we drop one wait bit. It is not the first time we face this situation. Originally this flag was used to detect the transition between WAIT_* and CONNECTED in order to call ->wake() from the FD handler. But since at least 1.8-dev1 with commit `7bf3fa3c23` ("BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is always synchronized against the two others before being checked. Moreover, with the I/Os moved to tasklets, the decision to call the ->wake() function is performed after the I/Os in si_cs_process() and equivalent, which don't care about this transition either. So in essence, checking for CO_FL_CONNECTED has become a lazy way to check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always relies on someone else having synchronized it. This patch addresses it once and for all by killing this flag and only checking the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). 
This revealed a number of inconsistencies that were purposely not addressed here for the sake of bisectability: - while most places do check both L4+L6 and HANDSHAKE at the same time, some places like assign_server() or back_handle_st_con() and a few sample fetches looking for proxy protocol do check for L4+L6 but don't care about HANDSHAKE ; these ones will probably fail on TCP request session rules if the handshake is not complete. - some handshake handlers do validate that a connection is established at L4 but didn't clear CO_FL_WAIT_L4_CONN - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6 before declaring the mux ready while the snd_buf function also checks for the handshake's completion. Likely the former should validate the handshake as well and we should get rid of these extra tests in snd_buf. - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only later clear CO_FL_WAIT_L4_CONN. - xprt_handshake would set CO_FL_CONNECTED itself without actually clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if waiting for a pure Rx handshake. - most places in ssl_sock that were checking CO_FL_CONNECTED don't need to include the L4 check as an L6 check is enough to decide whether to wait for more info or not. It also becomes obvious when reading the test in si_cs_recv() that caused the failure mentioned above that once converted it doesn't make any sense anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete cannot happen since for CS_FL_EOS to be set, the other ones must have been validated. Some of these parts will still deserve further cleanup, and some of the observations above may induce some backports of potential bug fixes once totally analyzed in their context. The risk of breaking existing stuff is too high to blindly backport everything.	2020-01-23 14:41:37 +01:00
Emmanuel Hocdet	078156d063	BUG/MINOR: ssl/cli: ocsp_issuer must be set w/ "set ssl cert" ocsp_issuer is primary set from ckch->chain when PEM is loaded from file, but not set when PEM is loaded via CLI payload. Set ckch->ocsp_issuer in ssl_sock_load_pem_into_ckch to fix that. Should be backported in 2.1.	2020-01-23 14:33:14 +01:00
Olivier Houchard	a8a415d31a	BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session(). We can't just assume conn_create_mux() will be called, and set CO_FL_CONNECTED, conn_complete_session() might be called synchronously if we're not using SSL, so we have no choice but to set CO_FL_CONNECTED in there. This should fix the recent breakage of the mcli reg tests.	2020-01-23 13:20:03 +01:00
William Lallemand	dad239d08b	BUG/MINOR: ssl: typo in previous patch The previous patch `5c3c96f` ("BUG/MINOR: ssl: memory leak w/ the ocsp_issuer") contains a typo that prevents it from building. Should be backported in 2.1.	2020-01-23 11:59:02 +01:00
William Lallemand	5c3c96fd36	BUG/MINOR: ssl: memory leak w/ the ocsp_issuer This patch frees the ocsp_issuer in ssl_sock_free_cert_key_and_chain_contents(). Should be backported in 2.1.	2020-01-23 11:57:39 +01:00
William Lallemand	b829dda57b	BUG/MINOR: ssl: increment issuer refcount if in chain When using the OCSP response, if the issuer of the response is in the certificate chain, its address will be stored in ckch->ocsp_issuer. However, since the ocsp_issuer could be filled by a separate file, this pointer is free'd. The refcount of the X509 needs to be incremented to avoid a double free if we free the ocsp_issuer AND the chain.	2020-01-23 11:57:39 +01:00
Willy Tarreau	027d206b57	CLEANUP: stats: shut up a wrong null-deref warning from gcc 9.2 As reported in bug #447, gcc 9.2 invents impossible code paths and then complains that we don't check for our pointers to be NULL... This code path is not critical, better add the test to shut it up than try to help it being less creative. This code hasn't changed for a while, so it could help distros to backport this to older releases.	2020-01-23 11:49:02 +01:00
Willy Tarreau	79fd577ac1	CLEANUP: backend: shut another false null-deref in back_handle_st_con() objt_conn() may return a NULL though here we don't have this situation anymore since the connection is always there, so let's simply switch to the unchecked __objt_conn(). This addresses issue #454.	2020-01-23 11:40:40 +01:00
Willy Tarreau	b1a40c72e7	CLEANUP: backend: remove useless test for inexistent connection Coverity rightfully reported that it's pointless to test for "conn" to be null while all code paths leading to it have already dereferenced it. This addresses issue #461.	2020-01-23 11:37:43 +01:00
William Lallemand	75b15f790f	BUG/MINOR: ssl/cli: free the previous ckch content once a PEM is loaded When using "set ssl cert" on the CLI, if we load a new PEM, the previous sctl, issuer and OCSP response are still loaded. This doesn't make any sense since they won't be usable with a new private key. This patch frees the previous data. Should be backported in 2.1.	2020-01-23 11:08:46 +01:00
Adis Nezirovic	d0142e7224	MINOR: cli: Report location of errors or any extra data for "show table" When using multiple filters with "show table", it can be useful to report which filter entry failed > show table MY_TABLE data.gpc0 gt 0 data.gpc0a lt 1000 Filter entry #2: Unknown data type > show table MY_TABLE data.gpc0 gt 0 data.gpc0 lt 1000a Filter entry #2: Require a valid integer value to compare against We now also catch garbage data after the filter > show table MY_TABLE data.gpc0 gt 0 data.gpc0 lt 1000 data.gpc0 gt 1\ data.gpc0 gt 10 a Detected extra data in filter, 16th word of input, after '10' Even before multi-filter feature we've also silently accepted garbage after the input, hiding potential bugs > show table MY_TABLE data.gpc0 gt 0 data.gpc0 or > show table MY_TABLE data.gpc0 gt 0 a In both cases, only first filter entry would be used, silently ignoring extra filter entry or garbage data. Last, but not the least, it is now possible to detect multi-filter feature from cli with something like the following: > show table MY_TABLE data.blah Filter entry #1: Unknown data type	2020-01-23 10:43:52 +01:00
Olivier Houchard	477902bd2e	MEDIUM: connections: Get ride of the xprt_done callback. The xprt_done_cb callback was used to defer some connection initialization until we're connected and the handshakes are done. As it mostly consists of creating the mux, instead of using the callback, introduce a conn_create_mux() function, that will just call conn_complete_session() for frontend, and create the mux for backend. In h2_wake(), make sure we call the wake method of the stream_interface, as we no longer wake up the stream task.	2020-01-22 18:56:05 +01:00
Olivier Houchard	8af03b396a	MEDIUM: streams: Always create a conn_stream in connect_server(). In connect_server(), when creating a new connection for which we don't yet know the mux (because it'll be decided by the ALPN), instead of associating the connection to the stream_interface, always create a conn_stream. This way, we have less special-casing needed. Store the conn_stream in conn->ctx, so that we can reach the upper layers if needed.	2020-01-22 18:55:59 +01:00
Adis Nezirovic	56dd354b3c	BUG/MINOR: cli: Missing arg offset for filter data values. We don't properly check for missing data values for additional filter entries, passing out of bounds index to args[], then passing to strlen. Introduced in commit `1a693fc2`: (MEDIUM: cli: Allow multiple filter entries for "show table")	2020-01-22 18:09:06 +01:00
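
The ratio-based picking described in commits a62917b890 and 1dfc9bbdc6 above (a share of the run budget per task class, with whatever one class does not consume passed on) can be illustrated with a minimal sketch. This is not HAProxy's scheduler code: `split_budget()`, `class_pct` and `NB_CLASSES` are hypothetical names, the percentages are the 33/50/17 defaults quoted in 1dfc9bbdc6, and the unused share is simply carried to the next class, which simplifies the "shared in equal proportions" wording of that commit.

```c
#include <assert.h>

#define NB_CLASSES 3 /* hypothetical: TC0 (urgent), TC1 (normal), TC2 (bulk) */

/* Percentages quoted in commit 1dfc9bbdc6; the array itself is illustrative. */
static const int class_pct[NB_CLASSES] = { 33, 50, 17 };

/*
 * Splits <budget> run slots between the classes. pending[i] is the number
 * of entries waiting in class i, picked[i] receives the number of entries
 * to run from it. The part of a share that a class cannot use is carried
 * over to the next class. Returns the total number of entries picked.
 */
static int split_budget(int budget, const int pending[NB_CLASSES],
                        int picked[NB_CLASSES])
{
    int carry = 0;
    int total = 0;
    int i;

    assert(budget >= 0);

    for (i = 0; i < NB_CLASSES; i++) {
        int share = budget * class_pct[i] / 100 + carry;

        /* the last class may use everything still left in the budget,
         * which also absorbs the integer rounding of the shares above */
        if (i == NB_CLASSES - 1)
            share = budget - total;

        picked[i] = pending[i] < share ? pending[i] : share;
        carry = share - picked[i];
        total += picked[i];
    }
    return total;
}
```

With a budget of 200 and queues holding 10, 500 and 500 entries, the first class uses only 10 of its 66 slots, so the second one may run 156 entries (its 100 plus the 56 carried over) and the last one gets the remaining 34.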

9436 Commits