haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-13 10:37:02 +02:00

Author	SHA1	Message	Date
William Lallemand	fa8cf0c476	MINOR: ssl: store a ptr to crtlist in crtlist_entry Store a pointer to crtlist in crtlist_entry so we can re-insert a crtlist_entry in its crtlist ebpt after updating its key.	2020-03-31 12:32:17 +02:00
William Lallemand	23d61c00b9	MINOR: ssl: add a list of crtlist_entry in ckch_store When updating a ckch_store we may want to update its pointer in the crtlist_entry which use it. To do this, we need the list of the entries using the store.	2020-03-31 12:32:17 +02:00
William Lallemand	493983128b	BUG/MINOR: ssl: ckch_inst wrongly inserted in crtlist_entry The instances were wrongly inserted in the crtlist entries, all instances of a crt-list were inserted in the last crt-list entry. Which was kind of handy to free all instances upon error. Now that it's done correctly, the error path was changed, it must iterate on the entries and find the ckch_insts which were generated for this bind_conf. To avoid wasting time, it stops the iteration once it found the first unsuccessful generation.	2020-03-31 12:32:17 +02:00
William Lallemand	ad3c37b760	REORG: ssl: move SETCERT enum to ssl_sock.h Move the SETCERT enum at the right place to cleanup ssl_sock.c.	2020-03-31 12:32:17 +02:00
William Lallemand	79d31ec0d4	MINOR: ssl: add a list of bind_conf in struct crtlist In order to be able to add new certificate in a crt-list, we need the list of bind_conf that uses this crt-list so we can create a ckch_inst for each of them.	2020-03-31 12:32:17 +02:00
William Lallemand	638f6ad033	MINOR: cli: add a general purpose pointer in the CLI struct This patch adds a p2 generic pointer which is inialized to zero before calling the parser.	2020-03-31 12:32:17 +02:00
Olivier Houchard	cf612a0457	MINOR: servers: Add a counter for the number of currently used connections. Add a counter to know the current number of used connections, as well as the max, this will be used later to refine the algorithm used to kill idle connections, based on current usage.	2020-03-30 00:30:01 +02:00
Jerome Magnin	824186bb08	MEDIUM: stream: support use-server rules with dynamic names With server-template was introduced the possibility to scale the number of servers in a backend without needing a configuration change and associated reload. On the other hand it became impractical to write use-server rules for these servers as they would only accept existing server labels as argument. This patch allows the use of log-format notation to describe targets of a use-server rules, such as in the example below: listen test bind *:1234 use-server %[hdr(srv)] if { hdr(srv) -m found } use-server s1 if { path / } server s1 127.0.0.1:18080 server s2 127.0.0.1:18081 If a use-server rule is applied because it was conditionned by an ACL returning true, but the target of the use-server rule cannot be resolved, no other use-server rule is evaluated and we fall back to load balancing. This feature was requested on the ML, and bumped with issue #563.	2020-03-29 09:55:10 +02:00
Olivier Houchard	dbda31939d	BUG/MINOR: connections: Set idle_time before adding to idle list. In srv_add_to_idle_list(), make sure we set the idle_time before we add the connection to an idle list, not after, otherwise another thread may grab it, set the idle_time to 0, only to have the original thread set it back to now_ms. This may have an impact, as in conn_free() we check idle_time to decide if we should decrement the idle connection counters for the server.	2020-03-22 20:05:59 +01:00
Olivier Houchard	ad91124bcf	BUILD/MEDIUM: fd: Declare fd_mig_lock as extern. Declare fd_mig_lock as extern so that it isn't defined multiple times. This should fix build for architectures without double-width CAS.	2020-03-20 11:42:11 +01:00
Olivier Houchard	566df309c6	MEDIUM: connections: Attempt to get idle connections from other threads. In connect_server(), if we no longer have any idle connections for the current thread, attempt to use the new "takeover" mux method to steal a connection from another thread. This should have no impact right now, given no mux implements it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	d2489e00b0	MINOR: connections: Add a flag to know if we're in the safe or idle list. Add flags to connections, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST, to let one know we are in the safe list, or the idle list.	2020-03-19 22:07:33 +01:00
Olivier Houchard	f0d4dff25c	MINOR: connections: Make the "list" element a struct mt_list instead of list. Make the "list" element a struct mt_list, and explicitely use list_from_mt_list to get a struct list * where it is used as such, so that mt_list_for_each_entry will be usable with it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	00bdce24d5	MINOR: connections: Add a new mux method, "takeover". Add a new mux method, "takeover", that will attempt to make the current thread responsible for the connection. It should return 0 on success, and non-zero on failure.	2020-03-19 22:07:33 +01:00
Olivier Houchard	8851664293	MINOR: fd: Implement fd_takeover(). Implement a new function, fd_takeover(), that lets you become the thread responsible for the fd. On architectures that do not have a double-width CAS, use a global rwlock. fd_set_running() was also changed to be able to compete with fd_takeover(), either using a dooble-width CAS on both running_mask and thread_mask, or by claiming a reader on the global rwlock. This extra operation should not have any measurable impact on modern architectures where threading is relevant.	2020-03-19 22:07:33 +01:00
Olivier Houchard	dc2f2753e9	MEDIUM: servers: Split the connections into idle, safe, and available. Revamp the server connection lists. We know have 3 lists : - idle_conns, which contains idling connections - safe_conns, which contains idling connections that are safe to use even for the first request - available_conns, which contains connections that are not idling, but can still accept new streams (those are HTTP/2 or fastcgi, and are always considered safe).	2020-03-19 22:07:33 +01:00
Olivier Houchard	2444aa5b66	MEDIUM: sessions: Don't be responsible for connections anymore. Make it so sessions are not responsible for connection anymore, except for connections that are private, and thus can't be shared, otherwise, as soon as a request is done, the session will just add the connection to the orphan connections pool. This will break http-reuse safe, but it is expected to be fixed later.	2020-03-19 22:07:33 +01:00
Olivier Houchard	899fb8abdc	MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc. The flush_lock was introduced, mostly to be sure that pool_gc() will never dereference a pointer that has been free'd. __pool_get_first() was acquiring the lock to, the fear was that otherwise that pointer could get free'd later, and then pool_gc() would attempt to dereference it. However, that can not happen, because the only functions that can free a pointer, when using lockless pools, are pool_gc() and pool_flush(), and as long as those two are mutually exclusive, nobody will be able to free the pointer while pool_gc() attempts to access it. So change the flush_lock to a spinlock, and don't bother acquire/release it in __pool_get_first(), that way callers of __pool_get_first() won't have to wait while the pool is flushed. The worst that can happen is we call __pool_refill_alloc() while the pool is getting flushed, and memory can get allocated just to be free'd. This may help with github issue #552 This may be backported to 2.1, 2.0 and 1.9.	2020-03-18 15:55:35 +01:00
Olivier Houchard	de01ea9878	MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h. Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into types/signal.h, so that we can access them in another file. We need those definition to avoid blocking those signals when running __signal_process_queue(). This should be backported to 2.1, 2.0 and 1.9.	2020-03-18 13:07:19 +01:00
Olivier Houchard	a7bf573520	MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock. In the struct fdtab, introduce a new mask, running_mask. Each thread should add its bit before using the fd. Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just spin as long as the mask is non-zero, to be sure we access the data exclusively. fd_set_running_excl() spins until the mask is 0, fd_set_running() just adds the thread bit, and fd_clr_running() removes it.	2020-03-17 15:30:07 +01:00
William Lallemand	2954c478eb	MEDIUM: ssl: allow crt-list caching The crtlist structure defines a crt-list in the HAProxy configuration. It contains crtlist_entry structures which are the lines in a crt-list file. crt-list are now loaded in memory using crtlist and crtlist_entry structures. The file is read only once. The generation algorithm changed a little bit, new ckch instances are generated from the crtlist structures, instead of being generated during the file loading. The loading function was split in two, one that loads and caches the crt-list and certificates, and one that looks for a crt-list and creates the ckch instances. Filters are also stored in crtlist_entry->filters as a char ** so we can generate the sni_ctx again if needed. I won't be needed anymore to parse the sni_ctx to do that. A crtlist_entry stores the list of all ckch_inst that were generated from this entry.	2020-03-16 16:18:49 +01:00
Willy Tarreau	e4d42551bd	BUILD: pools: silence build warnings with DEBUG_MEMORY_POOLS and DEBUG_UAF With these debug options we still get these warnings: include/common/memory.h:501:23: warning: null pointer dereference [-Wnull-dereference] (volatile int )0 = 0; ~~~~~~~~~~~~~~~~~~~^~~ include/common/memory.h:460:22: warning: null pointer dereference [-Wnull-dereference] (volatile int )0 = 0; ~~~~~~~~~~~~~~~~~~~^~~ These are purposely there to crash the process at specific locations. But the annoying warnings do not help with debugging and they are not even reliable as the compiler may decide to optimize them away. Let's pass the pointer through DISGUISE() to avoid this.	2020-03-14 11:10:21 +01:00
Willy Tarreau	2e8ab6b560	MINOR: use DISGUISE() everywhere we deliberately want to ignore a result It's more generic and versatile than the previous shut_your_big_mouth_gcc() that was used to silence annoying warnings as it's not limited to ignoring syscalls returns only. This allows us to get rid of the aforementioned function and the shut_your_big_mouth_gcc_int variable, that started to look ugly in multi-threaded environments.	2020-03-14 11:04:49 +01:00
Willy Tarreau	15ed69fd3f	MINOR: debug: consume the write() result in BUG_ON() to silence a warning Tim reported that BUG_ON() issues warnings on his distro, as the libc marks some syscalls with __attribute__((warn_unused_result)). Let's pass the write() result through DISGUISE() to hide it.	2020-03-14 10:58:35 +01:00
Willy Tarreau	f401668306	MINOR: debug: add a new DISGUISE() macro to pass a value as identity This does exactly the same as ALREADY_CHECKED() but does it inline, returning an identical copy of the scalar variable without letting the compiler know how it might have been transformed. This can forcefully disable certain null-pointer checks or result checks when known undesirable. Typically forcing a crash with *(DISGUISE(NULL))=0 will not cause a null-deref warning.	2020-03-14 10:52:46 +01:00
Ilya Shipitsin	77e3b4a2c4	CLEANUP: assorted typo fixes in the code and comments These are mostly comments in the code. A few error messages were fixed and are of low enough importance not to deserve a backport. Some regtests were also fixed.	2020-03-14 09:42:07 +01:00
Tim Duesterhus	cf6e0c8a83	MEDIUM: proxy_protocol: Support sending unique IDs using PPv2 This patch adds the `unique-id` option to `proxy-v2-options`. If this option is set a unique ID will be generated based on the `unique-id-format` while sending the proxy protocol v2 header and stored as the unique id for the first stream of the connection. This feature is meant to be used in `tcp` mode. It works on HTTP mode, but might result in inconsistent unique IDs for the first request on a keep-alive connection, because the unique ID for the first stream is generated earlier than the others. Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made available in TCP mode.	2020-03-13 17:26:43 +01:00
Tim Duesterhus	d1b15b6e9b	MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections This patch reads a proxy protocol v2 provided unique ID and makes it available using the `fc_pp_unique_id` fetch.	2020-03-13 17:25:23 +01:00
Tim Duesterhus	b435f77620	DOC: proxy_protocol: Reserve TLV type 0x05 as PP2_TYPE_UNIQUE_ID This reserves and defines TLV type 0x05.	2020-03-13 17:25:23 +01:00
Olivier Houchard	84fd8a77b7	MINOR: lists: fix indentation. Fix indentation in the recently added list_to_mt_list().	2020-03-11 21:41:13 +01:00
Olivier Houchard	8676514d4e	MINOR: servers: Kill priv_conns. Remove the list of private connections from server, it has been largely unused, we only inserted connections in it, but we would never actually use it.	2020-03-11 19:20:01 +01:00
Olivier Houchard	751e5e21a9	MINOR: lists: Implement function to convert list => mt_list and mt_list => list Implement mt_list_to_list() and list_to_mt_list(), to be able to convert from a struct list to a struct mt_list, and vice versa. This is normally of no use, except for struct connection's list field, that can go in either a struct list or a struct mt_list.	2020-03-11 17:10:40 +01:00
Olivier Houchard	49983a9fe1	MINOR: mt_lists: Appease gcc. gcc is confused, and think p may end up being NULL in _MT_LIST_RELINK_DELETED. It should never happen, so let gcc know that.	2020-03-11 17:10:08 +01:00
Willy Tarreau	638698da37	BUILD: stream-int: fix a few includes dependencies The stream-int code doesn't need to load server.h as it doesn't use servers at all. However removing this one reveals that proxy.h was lacking types/checks.h that used to be silently inherited from types/server.h loaded before in stream_interface.h.	2020-03-11 14:15:33 +01:00
Willy Tarreau	855796bdc8	BUG/MAJOR: list: fix invalid element address calculation Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10 (pre-release). It turns out that constructs such as: while (item != head) { item = LIST_ELEM(item.n); } loop forever, never matching <item> to <head> despite a printf there showing them equal. In practice the problem is that the LIST_ELEM() macro is wrong, it assigns the subtract of two pointers (an integer) to another pointer through a cast to its pointer type. And GCC 10 now considers that this cannot match a pointer and silently optimizes the comparison away. A tested workaround for this is to build with -fno-tree-pta. Note that older gcc versions even with -ftree-pta do not exhibit this rather surprizing behavior. This patch changes the test to instead cast the null-based address to an int to get the offset and subtract it from the pointer, and this time it works. There were just a few places to adjust. Ideally offsetof() should be used but the LIST_ELEM() API doesn't make this trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*) thus it would require to completely change the whole API, which is not something workable in the short term, especially for a backport. With this change, the emitted code is subtly different even on older versions. A code size reduction of ~600 bytes and a total executable size reduction of ~1kB are expected to be observed and should not be taken as an anomaly. Typically this loop in dequeue_proxy_listeners() : while ((listener = MT_LIST_POP(...))) used to produce this code where the comparison is performed on RAX while the new offset is assigned to RDI even though both are always identical: 53ded8: 48 8d 78 c0 lea -0x40(%rax),%rdi 53dedc: 48 83 f8 40 cmp $0x40,%rax 53dee0: 74 39 je 53df1b <dequeue_proxy_listeners+0xab> and now produces this one which is slightly more efficient as the same register is used for both purposes: 53dd08: 48 83 ef 40 sub $0x40,%rdi 53dd0c: 74 2d je 53dd3b <dequeue_proxy_listeners+0x9b> Similarly, retrieving the channel from a stream_interface using si_ic() and si_oc() used to cause this (stream-int in rdi): 1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi) 1cbe: f6 47 04 10 testb $0x10,0x4(%rdi) 1cc2: 74 1c je 1ce0 <si_report_error+0x30> 1cc4: 48 81 ef 00 03 00 00 sub $0x300,%rdi 1ccb: 81 4f 10 00 08 00 00 orl $0x800,0x10(%rdi) and now causes this: 1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi) 1cbe: f6 47 04 10 testb $0x10,0x4(%rdi) 1cc2: 74 1c je 1ce0 <si_report_error+0x30> 1cc4: 81 8f 10 fd ff ff 00 orl $0x800,-0x2f0(%rdi) There is extremely little chance that this fix wakes up a dormant bug as the emitted code effectively does what the source code intends. This must be backported to all supported branches (dropping MT_LIST_ELEM and the spoa_example parts as needed), since the bug is subtle and may not always be visible even when compiling with gcc-10.	2020-03-11 14:12:51 +01:00
Olivier Houchard	1d117e3dcd	BUG/MEDIUM: mt_lists: Make sure we set the deleted element to NULL; In MT_LIST_DEL_SAFE(), when the code was changed to use a temporary variable instead of using the provided pointer directly, we shouldn't have changed the code that set the pointer to NULL, as we really want the pointer provided to be nullified, otherwise other parts of the code won't know we just deleted an element, and bad things will happen. This should be backported to 2.1.	2020-03-10 17:45:05 +01:00
Willy Tarreau	9a0dfa5298	CLEANUP: remove the now unused common/syscall.h It was added 9 years ago to implement USE_MY_SPLICE on some libcs where syscall() was bogus. It's about time to get rid of this.	2020-03-10 07:28:46 +01:00
Willy Tarreau	06c63aec95	CLEANUP: remove support for USE_MY_SPLICE The splice() syscall has been supported in glibc since version 2.5 issued in 2006 and is present on supported systems so there's no need for having our own arch-specific syscall definitions anymore.	2020-03-10 07:23:41 +01:00
Willy Tarreau	3858b122a6	CLEANUP: remove support for USE_MY_EPOLL This was made to support epoll on patched 2.4 kernels, and on early 2.6 using alternative libcs thanks to the arch-specific syscall definitions. All the features we support have been around since 2.6.2 and present in glibc since 2.3.2, neither of which are found in field anymore. Let's simply drop this and use epoll normally.	2020-03-10 07:08:10 +01:00
Willy Tarreau	618ac6ea52	CLEANUP: drop support for USE_MY_ACCEPT4 The accept4() syscall has been present for a while now, there is no more reason for maintaining our own arch-specific syscall implementation for systems lacking it in libc but having it in the kernel.	2020-03-10 07:02:46 +01:00
Willy Tarreau	c3e926bf3b	CLEANUP: remove support for Linux i686 vsyscalls This was introduced 10 years ago to squeeze a few CPU cycles per syscall on 32-bit x86 machines and was already quite old by then, requiring to explicitly enable support for this in the kernel. We don't even know if it still builds, let alone if it works at all on recent kernels! Let's completely drop this now.	2020-03-10 06:55:52 +01:00
William Lallemand	6763016866	BUG/MINOR: ssl/cli: sni_ctx' mustn't always be used as filters Since commit 244b070 ("MINOR: ssl/cli: support crt-list filters"), HAProxy generates a list of filters based on the sni_ctx in memory. However it's not always relevant, sometimes no filters were configured and the CN/SAN in the new certificate are not the same. This patch fixes the issue by using a flag filters in the ckch_inst, so we are able to know if there were filters or not. In the late case it uses the CN/SAN of the new certificate to generate the sni_ctx. note: filters are still only used in the crt-list atm.	2020-03-09 17:32:04 +01:00
William Lallemand	0a52846603	CLEANUP: ssl: is_default is a bit in ckch_inst The field is_default becomes a bit in the ckch_inst structure.	2020-03-09 17:32:04 +01:00
Miroslav Zagorac	d7dc67ba1d	CLEANUP: remove unused code in 'my_ffsl/my_flsl' functions Shifting the variable 'a' one bit to the right has no effect on the result of the functions.	2020-03-09 14:47:27 +01:00
Willy Tarreau	ee3bcddef7	MINOR: tools: add a generic function to generate UUIDs We currently have two UUID generation functions, one for the sample fetch and the other one in the SPOE filter. Both were a bit complicated since they were made to support random() implementations returning an arbitrary number of bits, and were throwing away 33 bits every 64. Now we don't need this anymore, so let's have a generic function consuming 64 bits at once and use it as appropriate.	2020-03-08 18:04:16 +01:00
Willy Tarreau	52bf839394	BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG This is the replacement of failed attempt to add thread safety and per-process sequences of random numbers initally tried with commit `1c306aa84d` ("BUG/MEDIUM: random: implement per-thread and per-process random sequences"). This new version takes a completely different approach and doesn't try to work around the horrible OS-specific and non-portable random API anymore. Instead it implements "xoroshiro128*", a reputedly high quality random number generator, which is one of the many variants of xorshift, which passes all quality tests and which is described here: http://prng.di.unimi.it/ While not cryptographically secure, it is fast and features a 2^128-1 period. It supports fast jumps allowing to cut the period into smaller non-overlapping sequences, which we use here to support up to 2^32 processes each having their own, non-overlapping sequence of 2^96 numbers (~710^28). This is enough to provide 1 billion randoms per second and per process for 2200 billion years. The implementation was made thread-safe either by using a double 64-bit CAS on platforms supporting it (x86_64, aarch64) or by using a local lock for the time needed to perform the shift operations. This ensures that all threads pick numbers from the same pool so that it is not needed to assign per-thread ranges. For processes we use the fast jump method to advance the sequence by 2^96 for each process. Before this patch, the following config: global nbproc 8 frontend f bind :4445 mode http log stdout format raw daemon log-format "%[uuid] %pid" redirect location / Would produce this output: a4d0ad64-2645-4b74-b894-48acce0669af 12987 a4d0ad64-2645-4b74-b894-48acce0669af 12992 a4d0ad64-2645-4b74-b894-48acce0669af 12986 a4d0ad64-2645-4b74-b894-48acce0669af 12988 a4d0ad64-2645-4b74-b894-48acce0669af 12991 a4d0ad64-2645-4b74-b894-48acce0669af 12989 a4d0ad64-2645-4b74-b894-48acce0669af 12990 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992 82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986 (...) And now produces: f94b29b3-da74-4e03-a0c5-a532c635bad9 13011 47470c02-4862-4c33-80e7-a952899570e5 13014 86332123-539a-47bf-853f-8c8ea8b2a2b5 13013 8f9efa99-3143-47b2-83cf-d618c8dea711 13012 3cc0f5c7-d790-496b-8d39-bec77647af5b 13015 3ec64915-8f95-4374-9e66-e777dc8791e0 13009 0f9bf894-dcde-408c-b094-6e0bb3255452 13011 49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014 e23f6f2e-35c5-4433-a294-b790ab902653 13012 There are multiple benefits to using this method. First, it doesn't depend anymore on a non-portable API. Second it's thread safe. Third it is fast and more proven than any hack we could attempt to try to work around the deficiencies of the various implementations around. This commit depends on previous patches "MINOR: tools: add 64-bit rotate operators" and "BUG/MEDIUM: random: initialize the random pool a bit better", all of which will need to be backported at least as far as version 2.0. It doesn't require to backport the build fixes for circular include files dependecy anymore.	2020-03-08 10:09:02 +01:00
Willy Tarreau	7a40909c00	MINOR: tools: add 64-bit rotate operators This adds rotl64/rotr64 to rotate a 64-bit word by an arbitrary number of bits. It's mainly aimed at being used with constants.	2020-03-08 00:42:18 +01:00
Willy Tarreau	0fbf28a05b	Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences" This reverts commit `1c306aa84d`. It breaks the build on all non-glibc platforms. I got confused by the man page (which possibly is the most confusing man page I've ever read about a standard libc function) and mistakenly understood that random_r was portable, especially since it appears in latest freebsd source as well but not in released versions, and with a slightly different API :-/ We need to find a different solution with a fallback. Among the possibilities, we may reintroduce this one with a fallback relying on locking around the standard functions, keeping fingers crossed for no other library function to call them in parallel, or we may also provide our own PRNG, which is not necessarily more difficult than working around the totally broken up design of the portable API.	2020-03-07 11:24:39 +01:00
Willy Tarreau	1c306aa84d	BUG/MEDIUM: random: implement per-thread and per-process random sequences As mentioned in previous patch, the random number generator was never made thread-safe, which used not to be a problem for health checks spreading, until the uuid sample fetch function appeared. Currently it is possible for two threads or processes to produce exactly the same UUID. In fact it's extremely likely that this will happen for processes, as can be seen with this config: global nbproc 8 frontend f bind :4445 mode http log stdout daemon format raw log-format "%[uuid] %pid" redirect location / It typically produces this log: 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646 07764439-c24d-4e6f-a5a6-0138be59e7a8 30645 07764439-c24d-4e6f-a5a6-0138be59e7a8 30639 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643 07764439-c24d-4e6f-a5a6-0138be59e7a8 30646 b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646 551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642 07764439-c24d-4e6f-a5a6-0138be59e7a8 30642 What this patch does is to use a distinct per-thread and per-process seed to make sure the same sequences will not appear, and will then extend these seeds by "burning" a number of randoms that depends on the global random seed, the thread ID and the process ID. This adds roughly 20 extra bits of randomness, resulting in 52 bits total per thread and per process. It only takes a few milliseconds to burn these randoms and given that threads start with a different seed, we know they will not catch each other. So these random extra bits are essentially added to ensure randomness between boots and cluster instances. This replaces all uses of random() with ha_random() which uses the thread-local state. This must be backported as far as 2.0 or any version having the UUID sample-fetch function since it's the main victim here. It's important to note that this patch, in addition to depending on the previous one "BUG/MEDIUM: init: initialize the random pool a bit better", also depends on the preceeding build fixes to address a circular dependency issue in the include files that prevented it from building. Part or all of these patches may need to be backported or adapted as well.	2020-03-07 06:11:15 +01:00
Willy Tarreau	6c3a681bd6	BUG/MEDIUM: random: initialize the random pool a bit better Since the UUID sample fetch was created, some people noticed that in certain virtualized environments they manage to get exact same UUIDs on different instances started exactly at the same moment. It turns out that the randoms were only initialized to spread the health checks originally, not to provide "clean" randoms. This patch changes this and collects more randomness from various sources, including existing randoms, /dev/urandom when available, RAND_bytes() when OpenSSL is available, as well as the timing for such operations, then applies a SHA1 on all this to keep a 160 bits random seed available, 32 of which are passed to srandom(). It's worth mentioning that there's no clean way to pass more than 32 bits to srandom() as even initstate() provides an opaque state that must absolutely not be tampered with since known implementations contain state information. At least this allows to have up to 4 billion different sequences from the boot, which is not that bad. Note that the thread safety was still not addressed, which is another issue for another patch. This must be backported to all versions containing the UUID sample fetch function, i.e. as far as 2.0.	2020-03-07 06:11:11 +01:00
Willy Tarreau	5a421a8f49	BUILD: listener: types/listener.h must not include standard.h It's only a type definition, this header is not needed and causes some circular dependency issues.	2020-03-07 06:07:18 +01:00
Willy Tarreau	c7f64e7a58	BUILD: freq_ctr: proto/freq_ctr needs to include common/standard.h This is needed for div_64_32() which is there and currently accidently inherited via global.h!	2020-03-07 06:07:18 +01:00
Willy Tarreau	f23e029409	BUILD: global: must not include common/standard.h but only types/freq_ctr.h This one was accidently inherited and used to work but causes a circular dependency.	2020-03-07 06:07:18 +01:00
Willy Tarreau	8dd0d55efe	BUILD: ssl: include mini-clist.h We use some list definitions and we don't include this header which is in fact accidently inherited from others, causing a circular dependency issue.	2020-03-07 06:07:18 +01:00
Willy Tarreau	a8561db936	BUILD: buffer: types/{ring.h,checks.h} should include buf.h, not buffer.h buffer.h relies on proto/activity because it contains some code and not just type definitions. It must not be included from types files. It should probably also be split in two if it starts to include a proto. This causes some circular dependencies at other places.	2020-03-07 06:07:18 +01:00
Christopher Faulet	d8f0e073dd	MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY This flag was used in some internal functions to be sure the current stream is able to handle HTTP content. It was introduced when the legacy HTTP code was still there. Now, It is possible to rely on stream's flags to be sure we have an HTX stream. So the flag HLUA_TXN_HTTP_RDY can be removed. Everywhere it was tested, it is replaced by a call to the IS_HTX_STRM() macro. This patch is mandatory to allow the support of the filters written in lua.	2020-03-06 14:13:00 +01:00
Christopher Faulet	1cdceb9365	MINOR: htx: Add a function to return a block at a specific offset The htx_find_offset() function may be used to look for a block at a specific offset in an HTX message, starting from the message head. A compound result is returned, an htx_ret structure, with the found block and the position of the offset in the block. If the offset is ouside of the HTX message, the returned block is NULL.	2020-03-06 14:12:59 +01:00
Christopher Faulet	251f4917c3	MINOR: buf: Add function to insert a string at an absolute offset in a buffer The b_insert_blk() function may now be used to insert a string, given a pointer and the string length, at an absolute offset in a buffer, moving data between this offset and the buffer's tail just after the end of the inserted string. The buffer's length is automatically updated. This function supports wrapping. All the string is copied or nothing. So it returns 0 if there are not enough space to perform the copy. Otherwise, the number of bytes copied is returned.	2020-03-06 14:12:59 +01:00
Carl Henrik Lunde	f91ac19299	OPTIM: startup: fast unique_id allocation for acl. pattern_finalize_config() uses an inefficient algorithm which is a problem with very large configuration files. This affects startup, and therefore reload time. When haproxy is deployed as a router in a Kubernetes cluster the generated configuration file may be large and reloads are frequently occuring, which makes this a significant issue. The old algorithm is O(n^2) * allocate missing uids - O(n^2) * sort linked list - O(n^2) The new algorithm is O(n log n): * find the user allocated uids - O(n) * store them for efficient lookup - O(n log n) * allocate missing uids - n times O(log n) * sort all uids - O(n log n) * convert back to linked list - O(n) Performance examples, startup time in seconds: pat_refs old new 1000 0.02 0.01 10000 2.1 0.04 20000 12.3 0.07 30000 27.9 0.10 40000 52.5 0.14 50000 77.5 0.17 Please backport to 1.8, 2.0 and 2.1.	2020-03-06 08:11:58 +01:00
Tim Duesterhus	a17e66289c	MEDIUM: stream: Make the `unique_id` member of `struct stream` a `struct ist` The `unique_id` member of `struct stream` now is a `struct ist`.	2020-03-05 20:21:58 +01:00
Tim Duesterhus	0643b0e7e6	MINOR: proxy: Make `header_unique_id` a `struct ist` The `header_unique_id` member of `struct proxy` now is a `struct ist`.	2020-03-05 19:58:22 +01:00
Tim Duesterhus	9576ab7640	MINOR: ist: Add `struct ist istdup(const struct ist)` istdup() performs the equivalent of strdup() on a `struct ist`.	2020-03-05 19:53:12 +01:00
Tim Duesterhus	35005d01d2	MINOR: ist: Add `struct ist istalloc(size_t)` and `void istfree(struct ist*)` `istalloc` allocates memory and returns an `ist` with the size `0` that points to this allocation. `istfree` frees the pointed memory and clears the pointer.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	e296d3e5f0	MINOR: ist: Add `int isttest(const struct ist)` `isttest` returns whether the `.ptr` is non-null.	2020-03-05 19:52:07 +01:00
Tim Duesterhus	241e29ef9c	MINOR: ist: Add `IST_NULL` macro `IST_NULL` is equivalent to an `struct ist` with `.ptr = NULL` and `.len = 0`.	2020-03-05 19:52:07 +01:00
William Lallemand	cfca1422c7	MINOR: ssl: reach a ckch_store from a sni_ctx It was only possible to go down from the ckch_store to the sni_ctx but not to go up from the sni_ctx to the ckch_store. To allow that, 2 pointers were added: - a ckch_inst pointer in the struct sni_ctx - a ckckh_store pointer in the struct ckch_inst	2020-03-05 11:28:42 +01:00
William Lallemand	38df1c8006	MINOR: ssl/cli: support crt-list filters Generate a list of the previous filters when updating a certificate which use filters in crt-list. Then pass this list to the function generating the sni_ctx during the commit. This feature allows the update of the crt-list certificates which uses the filters with "set ssl cert". This function could be probably replaced by creating a new ckch_inst_new_load_store() function which take the previous sni_ctx list as an argument instead of the char **sni_filter, avoiding the allocation/copy during runtime for each filter. But since are still handling the multi-cert bundles, it's better this way to avoid code duplication.	2020-03-05 11:27:53 +01:00
Tim Duesterhus	127a74dd48	MINOR: stream: Add stream_generate_unique_id function Currently unique IDs for a stream are generated using repetitive code in multiple locations, possibly allowing for inconsistent behavior.	2020-03-05 07:23:00 +01:00
Willy Tarreau	899e5f69a1	MINOR: debug: use our own backtrace function on clang+x86_64 A test on FreeBSD with clang 4 to 8 produces this on a call to a spinning loop on the CLI: call trace(5): \| 0x53e2bc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c \| 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e with our own function it correctly produces this: call trace(20): \| 0x53e2dc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c \| 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e \| 0x800e022bf [48 83 c4 38 5b 41 5c 41]: libthr:pthread_getspecific+0xdef \| 0x7ffffffff003 [48 8d 7c 24 10 6a 00 48]: main+0x7fffffb416f3 \| 0x801373809 [85 c0 0f 84 6f ff ff ff]: libc:__sys_gettimeofday+0x199 \| 0x801373709 [89 c3 85 c0 75 a6 48 8b]: libc:__sys_gettimeofday+0x99 \| 0x801371c62 [83 f8 4e 75 0f 48 89 df]: libc:gettimeofday+0x12 \| 0x51fa0a [48 89 df 4c 89 f6 e8 6b]: ha_thread_dump_all_to_trash+0x49a \| 0x4b723b [85 c0 75 09 49 8b 04 24]: mworker_cli_sockpair_new+0xd9b \| 0x4b6c68 [85 c0 75 08 4c 89 ef e8]: mworker_cli_sockpair_new+0x7c8 \| 0x532f81 [4c 89 e7 48 83 ef 80 41]: task_run_applet+0xe1 So let's add clang+x86_64 to the list of platforms that will use our simplified version. As a bonus it will not require to link with -lexecinfo on FreeBSD and will work out of the box when passing USE_BACKTRACE=1.	2020-03-04 12:04:07 +01:00
Willy Tarreau	13faf16e1e	MINOR: debug: improve backtrace() on aarch64 and possibly other systems It happens that on aarch64 backtrace() only returns one entry (tested with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding the stack due to the risk of hitting a bad pointer. Here we can use may_access() to know when it's safe, so we can actually unwind the stack without taking risks. It happens that the faulting function (the one just after the signal handler) is not listed here, very likely because the signal handler uses a special stack and did not create a new frame. So this patch creates a new my_backtrace() function in standard.h that either calls backtrace() or does its own unrolling. The choice depends on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build target.	2020-03-04 12:04:07 +01:00
Emmanuel Hocdet	842e94ee06	MINOR: ssl: add "ca-verify-file" directive It's only available for bind line. "ca-verify-file" allows to separate CA certificates from "ca-file". CA names sent in server hello message is only compute from "ca-file". Typically, "ca-file" must be defined with intermediate certificates and "ca-verify-file" with certificates to ending the chain, like root CA. Fix issue #404.	2020-03-04 11:53:11 +01:00
Willy Tarreau	eb8b1ca3eb	MINOR: tools: add resolve_sym_name() to resolve function pointers We use various hacks at a few places to try to identify known function pointers in debugging outputs (show threads & show fd). Let's centralize this into a new function dedicated to this. It already knows about the functions matched by "show threads" and "show fd", and when built with USE_DL, it can rely on dladdr1() to resolve other functions. There are some limitations, as static functions are not resolved, linking with -rdynamic is mandatory, and even then some functions will not necessarily appear. It's possible to do a better job by rebuilding the whole symbol table from the ELF headers in memory but it's less portable and the gains are still limited, so this solution remains a reasonable tradeoff.	2020-03-03 18:18:40 +01:00
Willy Tarreau	762fb3ec8e	MINOR: tools: add new function dump_addr_and_bytes() This function dumps <n> bytes from <addr> in hex form into buffer <buf> enclosed in brackets after the address itself, formatted on 14 chars including the "0x" prefix. This is meant to be used as a prefix for code areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: " It relies on may_access() to know if the bytes are dumpable, otherwise "--" is emitted. An optional prefix is supported.	2020-03-03 17:46:37 +01:00
Willy Tarreau	27d00c0167	MINOR: task: export run_tasks_from_list This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	3ebd55ee51	MINOR: haproxy: export run_poll_loop This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	1827845a3d	MINOR: haproxy: export main to ease access from debugger Better just export main instead of declaring it as extern, it's cleaner and may be usable elsewhere.	2020-03-03 15:26:10 +01:00
Willy Tarreau	1ed3781e21	MINOR: fd: merge the read and write error bits into RW error We always set them both, which makes sense since errors at the FD level indicate a terminal condition for the socket that cannot be recovered. Usually this is detected via a write error, but sometimes such an error may asynchronously be reported on the read side. Let's simplify this using only the write bit and calling it RW since it's used like this everywhere, and leave the R bit spare for future use.	2020-02-28 07:42:29 +01:00
Willy Tarreau	a135ea63a6	CLEANUP: fd: remove some unneeded definitions of FD_EV_* flags There's no point in trying to be too generic for these flags as the read and write sides will soon differ a bit. Better explicitly define the flags for each direction without trying to be direction-agnostic. this clarifies the code and removes some defines.	2020-02-28 07:42:29 +01:00
Willy Tarreau	f80fe832b1	CLEANUP: fd: remove the FD_EV_STATUS aggregate This was used only by fd_recv_state() and fd_send_state(), both of which are unused. This will not work anymore once recv and send flags start to differ, so let's remove this.	2020-02-28 07:42:29 +01:00
Jerome Magnin	967d3cc105	BUG/MINOR: http_ana: make sure redirect flags don't have overlapping bits commit `c87e46881` ("MINOR: http-rules: Add a flag on redirect rules to know the rule direction") introduced a new flag for redirect rules, but its value has bits in common with REDIRECT_FLAG_DROP_QS, which makes us enter this code path in http_apply_redirect_rule(), which will then drop the query string. To fix this, just give REDIRECT_FLAG_FROM_REQ its own unique value. This must be backported where `c87e468816` is backported. This should fix issue 521.	2020-02-27 23:44:41 +01:00
Willy Tarreau	2104659cd5	MEDIUM: buffer: remove the buffer_wq lock This lock was only needed to protect the buffer_wq list, but now we have the mt_list for this. This patch simply turns the buffer_wq list to an mt_list and gets rid of the lock. It's worth noting that the whole buffer_wait thing still looks totally wrong especially in a threaded context: the wakeup_cb() callback is called synchronously from any thread and may end up calling some connection code that was not expected to run on a given thread. The whole thing should probably be reworked to use tasklets instead and be a bit more centralized.	2020-02-26 10:39:36 +01:00
William Lallemand	e0f3fd5b4c	CLEANUP: ssl: move issuer_chain tree and definition Move the cert_issuer_tree outside the global_ssl structure since it's not a configuration variable. And move the declaration of the issuer_chain structure in types/ssl_sock.h	2020-02-25 15:06:40 +01:00
Willy Tarreau	226ef26056	MINOR: compiler: add new alignment macros This commit adds ALWAYS_ALIGN(), MAYBE_ALIGN() and ATOMIC_ALIGN() to be placed as delimitors inside structures to force alignment to a given size. These depend on the architecture's capabilities so that it is possible to always align, align only on archs not supporting unaligned accesses at all, or only on those not supporting them for atomic accesses (e.g. before a lock).	2020-02-25 10:34:43 +01:00
Willy Tarreau	908071171b	BUILD: general: always pass unsigned chars to is* functions The isalnum(), isalpha(), isdigit() etc functions from ctype.h are supposed to take an int in argument which must either reflect an unsigned char or EOF. In practice on some platforms they're implemented as macros referencing an array, and when passed a char, they either cause a warning "array subscript has type 'char'" when lucky, or cause random segfaults when unlucky. It's quite unconvenient by the way since none of them may return true for negative values. The recent introduction of cygwin to the list of regularly tested build platforms revealed a lot of breakage there due to the same issues again. So this patch addresses the problem all over the code at once. It adds unsigned char casts to every valid use case, and also drops the unneeded double cast to int that was sometimes added on top of it. It may be backported by dropping irrelevant changes if that helps better support uncommon platforms. It's unlikely to fix bugs on platforms which would already not emit any warning though.	2020-02-25 08:16:33 +01:00
Willy Tarreau	03e7853581	BUILD: remove obsolete support for -mregparm / USE_REGPARM This used to be a minor optimization on ix86 where registers are scarce and the calling convention not very efficient, but this platform is not relevant enough anymore to warrant all this dirt in the code for the sake of saving 1 or 2% of performance. Modern platforms don't use this at all since their calling convention already defaults to using several registers so better get rid of this once for all.	2020-02-25 07:41:47 +01:00
Tim Duesterhus	1d48ba91d7	CLEANUP: net_helper: Do not negate the result of unlikely This patch turns the double negation of 'not unlikely' into 'likely' and then turns the negation of 'not smaller' into 'greater or equal' in an attempt to improve readability of the condition. [wt: this was not a bug but purposely written like this to improve code generation on older compilers but not needed anymore as described here: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]	2020-02-25 07:30:49 +01:00
Tim Duesterhus	927063b892	CLEANUP: conn: Do not pass a pointer to likely Move the `!` inside the likely and negate it to unlikely. The previous version should not have caused issues, because it is converted to a boolean / integral value before being passed to __builtin_expect(), but it's certainly unusual. [wt: this was not a bug but purposely written like this to improve code generation on older compilers but not needed anymore as described here: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]	2020-02-25 07:30:49 +01:00
Willy Tarreau	89ee79845c	MINOR: compiler: drop special cases of likely/unlikely for older compilers We used to special-case the likely()/unlikely() macros for a series of early gcc 4.x compilers which used to produce very bad code when using __builtin_expect(x,1), which basically used to build an integer (0 or 1) from a condition then compare it to integer 1. This was already fixed in 5.x, but even now, looking at the code produced by various flavors of 4.x this bad behavior couldn't be witnessed anymore. So let's consider it as fixed by now, which will allow to get rid of some ugly tricks at some specific places. A test on 4.7.4 shows that the code shrinks by about 3kB now, thanks to some tests being inlined closer to the call place and the unlikely case being moved to real functions. See the link below for more background on this. Link: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html	2020-02-25 07:29:55 +01:00
Willy Tarreau	0e2686762f	MINOR: compiler: move CPU capabilities definition from config.h and complete them These ones are irrelevant to the config but rather to the platform, and as such are better placed in compiler.h. Here we take the opportunity for declaring a few extra capabilities: - HA_UNALIGNED : CPU supports unaligned accesses - HA_UNALIGNED_LE : CPU supports unaligned accesses in little endian - HA_UNALIGNED_FAST : CPU supports fast unaligned accesses - HA_UNALIGNED_ATOMIC : CPU supports unaligned accesses in atomics This will help remove a number of #ifdefs with arch-specific statements.	2020-02-21 16:32:57 +01:00
Jerome Magnin	9dde0b2d31	MINOR: ist: add an iststop() function Add a function that finds a character in an ist and returns an updated ist with the length of the portion of the original string that doesn't contain the char. Might be backported to 2.1	2020-02-21 11:47:25 +01:00
Willy Tarreau	716bec2dc6	MINOR: connection: introduce a new receive flag: CO_RFL_READ_ONCE This flag is currently supported by raw_sock to perform a single recv() attempt and avoid subscribing. Typically on the request and response paths with keep-alive, with short messages we know that it's very likely that the first message is enough.	2020-02-21 11:22:45 +01:00
Willy Tarreau	5d4d1806db	CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv} This marks the end of the transition from the connection polling states introduced in 1.5-dev12 and the subscriptions in that arrived in 1.9. The socket layer can now safely use its FD while all upper layers rely exclusively on subscriptions. These old functions were removed. Some may deserve some renaming to improved clarty though. The single call to conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling() which already does the same.	2020-02-21 11:21:12 +01:00
Willy Tarreau	d1d14c3157	MINOR: connection: remove the last calls to conn_xprt_{want,stop}_* The last few calls to conn_xprt_{want,stop}_{recv,send} in the central connection code were replaced with their strictly exact equivalent fd_*, adding the call to conn_ctrl_ready() when it was missing.	2020-02-21 11:21:12 +01:00
Willy Tarreau	19bc201c9f	MEDIUM: connection: remove the intermediary polling state from the connection Historically we used to require that the connections held the desired polling states for the data layer and the socket layer. Then with muxes these were more or less merged into the transport layer, and now it happens that with all transport layers having their own state, the "transport layer state" as we have it in the connection (XPRT_RD_ENA, XPRT_WR_ENA) is only an exact copy of the undelying file descriptor state, but with a delay. All of this is causing some difficulties at many places in the code because there are still some locations which use the conn_want_* API to remain clean and only rely on connection, and count on a later collection call to conn_cond_update_polling(), while others need an immediate action and directly use the FD updates. Since our updates are now much cheaper, most of them being only an atomic test-and-set operation, and since our I/O callbacks are deferred, there's no benefit anymore in trying to "cache" the transient state change in the connection flags hoping to cancel them before they become an FD event. Better make such calls transparent indirections to the FD layer instead and get rid of the deferred operations which needlessly complicate the logic inside. This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE. A number of functions related to polling updates were either greatly simplified or removed. Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data were expected to be sent after a PROXY protocol or SOCKSv4 header. These ones were simply replaced with a check on the subscription which is where we ought to get the autoritative information from. Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts are the same. conn_stop_polling() and conn_xprt_stop_both() are the same as well. conn_cond_update_polling() only causes errors to stop polling. It also becomes way more obvious that muxes should not at all employ conn_xprt_{want\|stop}_{recv,send}(), and that the call to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer is inappropriate, it ought to unsubscribe from reads instead. All of this definitely requires a serious cleanup.	2020-02-21 11:21:12 +01:00
Christopher Faulet	727a3f1ca3	MINOR: http-htx: Add a function to retrieve the headers size of an HTX message http_get_hdrs_size() function may now be used to get the bytes held by headers in an HTX message. It only works if the headers were not already forwarded. Metadata are not counted here.	2020-02-18 11:19:57 +01:00
Willy Tarreau	a71667c07d	BUG/MINOR: tools: also accept '+' as a valid character in an identifier The function is_idchar() was added by commit `36f586b` ("MINOR: tools: add is_idchar() to tell if a char may belong to an identifier") to ease matching of sample fetch/converter names. But it lacked support for the '+' character used in "base32+src" and "url32+src". A quick way to figure the list of supported sample fetch+converter names is to issue the following command: git grep '"[^"]",.SMP_T_.*SMP_USE_'\|cut -f2 -d'"'\|sort -u No more entry is reported once searching for characters not covered by is_idchar(). No backport is needed.	2020-02-17 06:37:40 +01:00
Willy Tarreau	e3b57bf92f	MINOR: sample: make sample_parse_expr() able to return an end pointer When an end pointer is passed, instead of complaining that a comma is missing after a keyword, sample_parse_expr() will silently return the pointer to the current location into this return pointer so that the caller can continue its parsing. This will be used by more complex expressions which embed sample expressions, and may even permit to embed sample expressions into arguments of other expressions.	2020-02-14 19:02:06 +01:00
Willy Tarreau	80b53ffb1c	MEDIUM: arg: make make_arg_list() stop after its own arguments The main problem we're having with argument parsing is that at the moment the caller looks for the first character looking like an end of arguments (')') and calls make_arg_list() on the sub-string inside the parenthesis. Let's first change the way it works so that make_arg_list() also consumes the parenthesis and returns the pointer to the first char not consumed. This will later permit to refine each argument parsing. For now there is no functional change.	2020-02-14 19:02:06 +01:00
Willy Tarreau	d4ad669051	MINOR: chunk: implement chunk_strncpy() to copy partial strings This does like chunk_strcpy() except that the maximum string length may be limited by the caller. A trailing zero is always appended. This is particularly handy to extract portions of strings to put into the trash for use with libc functions requiring a nul-terminated string.	2020-02-14 19:02:06 +01:00
Willy Tarreau	36f586b694	MINOR: tools: add is_idchar() to tell if a char may belong to an identifier This function will simply be used to find the end of config identifiers (proxies, servers, ACLs, sample fetches, converters, etc).	2020-02-14 19:02:06 +01:00
Ilya Shipitsin	88a2f0304c	CLEANUP: ssl: remove unused functions in openssl-compat.h functions SSL_SESSION_get0_id_context, SSL_CTX_get_default_passwd_cb, SSL_CTX_get_default_passwd_cb_userdata are not used anymore	2020-02-14 16:15:00 +01:00
Willy Tarreau	160ad9e38a	CLEANUP: mini-clist: simplify nested do { while(1) {} } while (0) While looking for other occurrences of do { continue; } while (0) I found these few leftovers in mini-clist where an outer loop was made around "do { } while (0)" then another loop was placed inside just to handle the continue. Let's clean this up by just removing the outer one. Most of the patch is only the inner part of the loop that is reindented. It was verified that the resulting code is the same.	2020-02-11 10:27:04 +01:00
Christopher Faulet	7716cdf450	MINOR: lua: Get the action return code on the stack when an action finishes When an action successfully finishes, the action return code (ACT_RET_*) is now retrieve on the stack, ff the first element is an integer. In addition, in hlua_txn_done(), the value ACT_RET_DONE is pushed on the stack before exiting. Thus, when a script uses this function, the corresponding action still finishes with the good code. Thanks to this change, the flag HLUA_STOP is now useless. So it has been removed. It is a mandatory step to allow a lua action to return any action return code.	2020-02-06 15:13:03 +01:00
Christopher Faulet	07a718e712	CLEANUP: lua: Remove consistency check for sample fetches and actions It is not possible anymore to alter the HTTP parser state from lua sample fetches or lua actions. So there is no reason to still check for the parser state consistency.	2020-02-06 15:13:03 +01:00
Christopher Faulet	4a2c142779	MEDIUM: http-rules: Support extra headers for HTTP return actions It is now possible to append extra headers to the generated responses by HTTP return actions, while it is not based on an errorfile. For return actions based on errorfiles, these extra headers are ignored. To define an extra header, a "hdr" argument must be used with a name and a value. The value is a log-format string. For instance: http-request status 200 hdr "x-src" "%[src]" hdr "x-dst" "%[dst]"	2020-02-06 15:13:03 +01:00
Christopher Faulet	24231ab61f	MEDIUM: http-rules: Add the return action to HTTP rules Thanks to this new action, it is now possible to return any responses from HAProxy, with any status code, based on an errorfile, a file or a string. Unlike the other internal messages generated by HAProxy, these ones are not interpreted as errors. And it is not necessary to use a file containing a full HTTP response, although it is still possible. In addition, using a log-format string or a log-format file, it is possible to have responses with a dynamic content. This action can be used on the request path or the response path. The only constraint is to have a responses smaller than a buffer. And to avoid any warning the buffer space reserved to the headers rewritting should also be free. When a response is returned with a file or a string as payload, it only contains the content-length header and the content-type header, if applicable. Here are examples: http-request return content-type image/x-icon file /var/www/favicon.ico \ if { path /favicon.ico } http-request return status 403 content-type text/plain \ lf-string "Access denied. IP %[src] is blacklisted." \ if { src -f /etc/haproxy/blacklist.lst }	2020-02-06 15:12:54 +01:00
Christopher Faulet	6d0c3dfac6	MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding This patch introduces the 'http-after-response' rules. These rules are evaluated at the end of the response analysis, just before the data forwarding, on ALL HTTP responses, the server ones but also all responses generated by HAProxy. Thanks to this ruleset, it is now possible for instance to add some headers to the responses generated by the stats applet. Following actions are supported : * allow * add-header * del-header * replace-header * replace-value * set-header * set-status * set-var * strict-mode * unset-var	2020-02-06 14:55:34 +01:00
Christopher Faulet	ef70e25035	MINOR: http-ana: Add a function for forward internal responses Operations performed when internal responses (redirect/deny/auth/errors) are returned are always the same. The http_forward_proxy_resp() function is added to group all of them under a unique function.	2020-02-06 14:55:34 +01:00
Christopher Faulet	72c7d8d040	MINOR: http-ana: Rely on http_reply_and_close() to handle server error The http_server_error() function now relies on http_reply_and_close(). Both do almost the same actions. In addtion, http_server_error() sets the error flag and the final state flag on the stream.	2020-02-06 14:55:34 +01:00
Christopher Faulet	c87e468816	MINOR: http-rules: Add a flag on redirect rules to know the rule direction HTTP redirect rules can be evaluated on the request or the response path. So when a redirect rule is evaluated, it is important to have this information because some specific processing may be performed depending on the direction. So the REDIRECT_FLAG_FROM_REQ flag has been added. It is set when applicable on the redirect rule during the parsing. This patch is mandatory to fix a bug on redirect rule. It must be backported to all stable versions.	2020-02-06 14:55:34 +01:00
Christopher Faulet	a4168434a7	MINOR: dns: Dynamically allocate dns options to reduce the act_rule size <.arg.dns.dns_opts> field in the act_rule structure is now dynamically allocated when a do-resolve rule is parsed. This drastically reduces the structure size.	2020-02-06 14:55:34 +01:00
Christopher Faulet	7651362e52	MINOR: htx/channel: Add a function to copy an HTX message in a channel's buffer The channel_htx_copy_msg() function can now be used to copy an HTX message in a channel's buffer. This function takes care to not overwrite existing data. This patch depends on the commit "MINOR: htx: Add a function to append an HTX message to another one". Both are mandatory to fix a bug in http_reply_and_close() function. Be careful to backport both first.	2020-02-06 14:55:16 +01:00
Christopher Faulet	0ea0c86753	MINOR: htx: Add a function to append an HTX message to another one the htx_append_msg() function can now be used to append an HTX message to another one. All the message is copied or nothing. If an error occurs during the copy, all changes are rolled back. This patch is mandatory to fix a bug in http_reply_and_close() function. Be careful to backport it first.	2020-02-06 14:54:47 +01:00
Olivier Houchard	1c7c0d6b97	BUG/MAJOR: memory: Don't forget to unlock the rwlock if the pool is empty. In __pool_get_first(), don't forget to unlock the pool lock if the pool is empty, otherwise no writer will be able to take the lock, and as it is done when reloading, it leads to an infinite loop on reload. This should be backported with commit `04f5fe87d3`	2020-02-03 13:05:31 +01:00
Olivier Houchard	04f5fe87d3	BUG/MEDIUM: memory: Add a rwlock before freeing memory. When using lockless pools, add a new rwlock, flush_pool. read-lock it when getting memory from the pool, so that concurrenct access are still authorized, but write-lock it when we're about to free memory, in pool_flush() and pool_gc(). The problem is, when removing an item from the pool, we unreference it to get the next one, however, that pointer may have been free'd in the meanwhile, and that could provoke a crash if the pointer has been unmapped. It should be OK to use a rwlock, as normal operations will still be able to access the pool concurrently, and calls to pool_flush() and pool_gc() should be pretty rare. This should be backported to 2.1, 2.0 and 1.9.	2020-02-01 18:08:34 +01:00
Willy Tarreau	b30a153cd1	MINOR: task: detect self-wakeups on tl==sched->current instead of TASK_RUNNING This is exactly what we want to detect (a task/tasklet waking itself), so let's use the proper condition for this.	2020-01-31 17:45:10 +01:00
Willy Tarreau	bb238834da	MINOR: task: permanently flag tasklets waking themselves up Commit `a17664d829` ("MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet") tried to inflict a penalty to self-requeuing tasks/tasklets which correspond to those involved in large, high-latency data transfers, for the benefit of all other processing which requires a low latency. However, it turns out that while it ought to do this on a case-by-case basis, basing itself on the RUNNING flag isn't accurate because this flag doesn't leave for tasklets, so we'd rather need a distinct flag to tag such tasklets. This commit introduces TASK_SELF_WAKING to mark tasklets acting like this. For now it's still set when TASK_RUNNING is present but this will have to change. The flag is kept across wakeups.	2020-01-31 17:45:10 +01:00
Willy Tarreau	a17664d829	MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet When a tasklet re-runs itself such as in this chain: si_cs_io_cb -> si_cs_process -> si_notify -> si_chk_rcv then we know it can easily clobber the run queue and harm latency. Now what the scheduler does when it detects this is that such a tasklet is automatically placed into the bulk list so that it's processed with the remaining CPU bandwidth only. Thanks to this the CLI becomes instantly responsive again even under heavy stress at 50 Gbps over 40kcon and 100% CPU on 16 threads.	2020-01-30 19:03:31 +01:00
Willy Tarreau	a62917b890	MEDIUM: tasks: implement 3 different tasklet classes with their own queues We used to mix high latency tasks and low latency tasklets in the same list, and to even refill bulk tasklets there, causing some unfairness in certain situations (e.g. poll-less transfers between many connections saturating the machine with similarly-sized in and out network interfaces). This patch changes the mechanism to split the load into 3 lists depending on the task/tasklet's desired classes : - URGENT: this is mainly for tasklets used as deferred callbacks - NORMAL: this is for regular tasks - BULK: this is for bulk tasks/tasklets Arbitrary ratios of max_processed are picked from each of these lists in turn, with the ability to complete in one list from what was not picked in the previous one. After some quick tests, the following setup gave apparently good results both for raw TCP with splicing and for H2-to-H1 request rate: - 0 to 75% for urgent - 12 to 50% for normal - 12 to what remains for bulk Bulk is not used yet.	2020-01-30 18:59:33 +01:00
Willy Tarreau	911db9bd29	MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE As mentioned in commit `c192b0ab95` ("MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of consistency on which flags are checked among L4/L6/HANDSHAKE depending on the code areas. A number of sample fetch functions only check for L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and many check both L4L6 and HANDSHAKE. This patch starts to make all of this more consistent by introducing a new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and reports whether the transport layer is ready or not. All inconsistent call places were updated to rely on this one each time the goal was to check for the readiness of the transport layer.	2020-01-23 16:34:26 +01:00
Willy Tarreau	4450b587dd	MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE Most places continue to check CO_FL_HANDSHAKE while in fact they should check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one dedicated to SSL renegotiation. In fact the SSL layer should be the only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data when a renegotiation is in progress, but other ones randomly include it without knowing. And ideally it should even be an internal flag that's not exposed in the connection. This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL. In order to limit the confusion that has accumulated over time, the CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake, possibly used by a renegotiation was moved after the other ones.	2020-01-23 16:34:26 +01:00
Willy Tarreau	c192b0ab95	MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_* Commit `477902bd2e` ("MEDIUM: connections: Get ride of the xprt_done callback.") broke the master CLI for a very obscure reason. It happens that short requests immediately terminated by a shutdown are properly received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not yet set on the connection since we've not passed again through conn_fd_handler() and it was not done in conn_complete_session(). While commit `a8a415d31a` ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session()") fixed the issue, such accident may happen again as the root cause is deeper and actually comes down to the fact that CO_FL_CONNECTED is lazily set at various check points in the code but not every time we drop one wait bit. It is not the first time we face this situation. Originally this flag was used to detect the transition between WAIT_* and CONNECTED in order to call ->wake() from the FD handler. But since at least 1.8-dev1 with commit `7bf3fa3c23` ("BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is always synchronized against the two others before being checked. Moreover, with the I/Os moved to tasklets, the decision to call the ->wake() function is performed after the I/Os in si_cs_process() and equivalent, which don't care about this transition either. So in essence, checking for CO_FL_CONNECTED has become a lazy wait to check for (CO_FL_WAIT_L4_CONN \| CO_FL_WAIT_L6_CONN), but that always relies on someone else having synchronized it. This patch addresses it once for all by killing this flag and only checking the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This revealed a number of inconsistencies that were purposely not addressed here for the sake of bisectability: - while most places do check both L4+L6 and HANDSHAKE at the same time, some places like assign_server() or back_handle_st_con() and a few sample fetches looking for proxy protocol do check for L4+L6 but don't care about HANDSHAKE ; these ones will probably fail on TCP request session rules if the handshake is not complete. - some handshake handlers do validate that a connection is established at L4 but didn't clear CO_FL_WAIT_L4_CONN - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6 before declaring the mux ready while the snd_buf function also checks for the handshake's completion. Likely the former should validate the handshake as well and we should get rid of these extra tests in snd_buf. - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only later clear CO_FL_WAIT_L4_CONN. - xprt_handshake would set CO_FL_CONNECTED itself without actually clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if waiting for a pure Rx handshake. - most places in ssl_sock that were checking CO_FL_CONNECTED don't need to include the L4 check as an L6 check is enough to decide whether to wait for more info or not. It also becomes obvious when reading the test in si_cs_recv() that caused the failure mentioned above that once converted it doesn't make any sense anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete cannot happen since for CS_FL_EOS to be set, the other ones must have been validated. Some of these parts will still deserve further cleanup, and some of the observations above may induce some backports of potential bug fixes once totally analyzed in their context. The risk of breaking existing stuff is too high to blindly backport everything.	2020-01-23 14:41:37 +01:00
Olivier Houchard	477902bd2e	MEDIUM: connections: Get ride of the xprt_done callback. The xprt_done_cb callback was used to defer some connection initialization until we're connected and the handshake are done. As it mostly consists of creating the mux, instead of using the callback, introduce a conn_create_mux() function, that will just call conn_complete_session() for frontend, and create the mux for backend. In h2_wake(), make sure we call the wake method of the stream_interface, as we no longer wakeup the stream task.	2020-01-22 18:56:05 +01:00
Olivier Houchard	8af03b396a	MEDIUM: streams: Always create a conn_stream in connect_server(). In connect_server(), when creating a new connection for which we don't yet know the mux (because it'll be decided by the ALPN), instead of associating the connection to the stream_interface, always create a conn_stream. This way, we have less special-casing needed. Store the conn_stream in conn->ctx, so that we can reach the upper layers if needed.	2020-01-22 18:55:59 +01:00
Emmanuel Hocdet	6b5b44e10f	BUG/MINOR: ssl: ssl_sock_load_pem_into_ckch is not consistent "set ssl cert <filename> <payload>" CLI command should have the same result as reload HAproxy with the updated pem file (<filename>). Is not the case, DHparams/cert-chain is kept from the previous context if no DHparams/cert-chain is set in the context (<payload>). This patch should be backport to 2.1	2020-01-22 15:55:55 +01:00
Adis Nezirovic	1a693fc2fd	MEDIUM: cli: Allow multiple filter entries for "show table" For complex stick tables with many entries/columns, it can be beneficial to filter using multiple criteria. The maximum number of filter entries can be controlled by defining STKTABLE_FILTER_LEN during build time. This patch can be backported to older releases.	2020-01-22 14:33:17 +01:00
Ilya Shipitsin	056c629531	BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x while working on issue #429, I encountered build failures with various non-released openssl versions, let us improve ssl defines, switch to features, not versions, for EVP_CTRL_AEAD_SET_IVLEN and EVP_CTRL_AEAD_SET_TAG. No backport is needed as there is no valid reason to build a stable haproxy version against a development version of openssl.	2020-01-22 07:54:52 +01:00
Willy Tarreau	2086365f51	CLEANUP: pattern: remove the pat_time definition It was inherited from acl_time, introduced in 1.3.10 by commit `a84d374367` ("[MAJOR] new framework for generic ACL support") and was never ever used. Let's simply drop it now.	2020-01-22 07:44:36 +01:00
Tim Duesterhus	6a0dd73390	CLEANUP: Consistently `unsigned int` for bitfields Signed bitfields of size `1` hold the values `0` and `-1`, but are usually assigned `1`, possibly leading to subtle bugs when the value is explicitely compared against `1`.	2020-01-22 07:28:39 +01:00
Baptiste Assmann	13a9232ebc	MEDIUM: dns: use Additional records from SRV responses Most DNS servers provide A/AAAA records in the Additional section of a response, which correspond to the SRV records from the Answer section: ;; QUESTION SECTION: ;_http._tcp.be1.domain.tld. IN SRV ;; ANSWER SECTION: _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A1.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A8.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A5.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A6.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A4.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A3.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A2.domain.tld. _http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A7.domain.tld. ;; ADDITIONAL SECTION: A1.domain.tld. 3600 IN A 192.168.0.1 A8.domain.tld. 3600 IN A 192.168.0.8 A5.domain.tld. 3600 IN A 192.168.0.5 A6.domain.tld. 3600 IN A 192.168.0.6 A4.domain.tld. 3600 IN A 192.168.0.4 A3.domain.tld. 3600 IN A 192.168.0.3 A2.domain.tld. 3600 IN A 192.168.0.2 A7.domain.tld. 3600 IN A 192.168.0.7 SRV record support was introduced in HAProxy 1.8 and the first design did not take into account the records from the Additional section. Instead, a new resolution is associated to each server with its relevant FQDN. This behavior generates a lot of DNS requests (1 SRV + 1 per server associated). This patch aims at fixing this by: - when a DNS response is validated, we associate A/AAAA records to relevant SRV ones - set a flag on associated servers to prevent them from running a DNS resolution for said FADN - update server IP address with information found in the Additional section If no relevant record can be found in the Additional section, then HAProxy will failback to running a dedicated resolution for this server, as it used to do. This behavior is the one described in RFC 2782.	2020-01-22 07:19:54 +01:00
Christopher Faulet	2f5339079b	MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive It is now possible to insert any attribute when a cookie is inserted by HAProxy. Any value may be set, no check is performed except the syntax validity (CTRL chars and ';' are forbidden). For instance, it may be used to add the SameSite attribute: cookie SRV insert attr "SameSite=Strict" The attr option may be repeated to add several attributes. This patch should fix the issue #361.	2020-01-22 07:18:31 +01:00
Christopher Faulet	554c0ebffd	MEDIUM: http-rules: Support an optional error message in http deny rules It is now possible to set the error message to use when a deny rule is executed. It may be a specific error file, adding "errorfile <file>" : http-request deny deny_status 400 errorfile /etc/haproxy/errorfiles/400badreq.http It may also be an error file from an http-errors section, adding "errorfiles <name>" : http-request deny errorfiles my-errors # use 403 error from "my-errors" section When defined, this error message is set in the HTTP transaction. The tarpit rule is also concerned by this change.	2020-01-20 15:18:46 +01:00
Christopher Faulet	473e880a25	MINOR: http-ana: Add an error message in the txn and send it when defined It is now possible to set the error message to return to client in the HTTP transaction. If it is defined, this error message is used instead of proxy's errors or default errors.	2020-01-20 15:18:46 +01:00
Christopher Faulet	76edc0f29c	MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy It is now possible to import in a proxy, fully or partially, error files declared in an http-errors section. It may be done using the "errorfiles" directive, followed by a name and optionally a list of status code. If there is no status code specified, all error files of the http-errors section are imported. Otherwise, only error files associated to the listed status code are imported. For instance : http-errors my-errors errorfile 400 ... errorfile 403 ... errorfile 404 ... frontend frt errorfiles my-errors 403 404 # ==> error 400 not imported	2020-01-20 15:18:46 +01:00
Christopher Faulet	35cd81d363	MINOR: http-htx: Add a new section to create groups of custom HTTP errors A new section may now be declared in the configuration to create global groups of HTTP errors. These groups are not linked to a proxy and are referenced by name. The section must be declared using the keyword "http-errors" followed by the group name. This name must be unique. A list of "errorfile" directives may be declared in such section. For instance: http-errors website-1 errorfile 400 /path/to/site1/400.http errorfile 404 /path/to/site1/404.http http-errors website-2 errorfile 400 /path/to/site2/400.http errorfile 404 /path/to/site2/404.http For now, it is just possible to create "http-errors" sections. There is no documentation because these groups are not used yet.	2020-01-20 15:18:46 +01:00
Christopher Faulet	5885775de1	MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages All custom HTTP errors are now stored in a global tree. Proxies use a references on these messages. The key used for errorfile directives is the file name as specified in the configuration. For errorloc directives, a key is created using the redirect code and the url. This means that the same custom error message is now stored only once. It may be used in several proxies or for several status code, it is only parsed and stored once.	2020-01-20 15:18:46 +01:00
Christopher Faulet	bdf6526e94	MINOR: http-htx: Add functions to create HTX redirect message http_parse_errorloc() may now be used to create an HTTP 302 or 303 redirect message with a specific url passed as parameter. A parameter is used to known if it is a 302 or a 303 redirect. A status code is passed as parameter. It must be one of the supported HTTP error codes to be valid. Otherwise an error is returned. It aims to be used to parse "errorloc" directives. It relies on http_load_errormsg() to do most of the job, ie converting it in HTX.	2020-01-20 15:18:45 +01:00
Christopher Faulet	5031ef58ca	MINOR: http-htx: Add functions to read a raw error file and convert it in HTX http_parse_errorfile() may now be used to parse a raw HTTP message from a file. A status code is passed as parameter. It must be one of the supported HTTP error codes to be valid. Otherwise an error is returned. It aims to be used to parse "errorfile" directives. It relies on http_load_errorfile() to do most of the job, ie reading the file content and converting it in HTX.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d73b96d48c	MINOR: tcp-rules: Make tcp-request capture a custom action Now, this action is use its own dedicated function and is no longer handled "in place" during the TCP rules evaluation. Thus the action name ACT_TCP_CAPTURE is removed. The action type is set to ACT_CUSTOM and a check function is used to know if the rule depends on request contents while there is no inspect-delay.	2020-01-20 15:18:45 +01:00
Christopher Faulet	ac98d81f46	MINOR: http-rule/tcp-rules: Make track-sc* custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the TCP/HTTP rules evaluation. Thus the action names ACT_ACTION_TRK_SC0 and ACT_ACTION_TRK_SCMAX are removed. The action type is now the tracking index. Thus the function trk_idx() is no longer needed.	2020-01-20 15:18:45 +01:00
Christopher Faulet	91b3ec13c6	MEDIUM: http-rules: Make early-hint custom actions Now, the early-hint action uses its own dedicated action and is no longer handled "in place" during the HTTP rules evaluation. Thus the action name ACT_HTTP_EARLY_HINT is removed. In additionn, http_add_early_hint_header() and http_reply_103_early_hints() are also removed. This part is now handled in the new action_ptr callback function.	2020-01-20 15:18:45 +01:00
Christopher Faulet	046cf44f6c	MINOR: http-rules: Make set/del-map and add/del-acl custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP__ACL and ACT_HTTP__MAP are removed. The action type is now mapped as following: 0 = add-acl, 1 = set-map, 2 = del-acl and 3 = del-map.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d1f27e3394	MINOR: http-rules: Make set-header and add-header custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_SET_HDR and ACT_HTTP_ADD_VAL are removed. The action type is now set to 0 to set a header (so remove existing ones if any and add a new one) or to 1 to add a header (add without remove).	2020-01-20 15:18:45 +01:00
Christopher Faulet	92d34fe38d	MINOR: http-rules: Make replace-header and replace-value custom actions Now, these actions use their own dedicated function and are no longer handled "in place" during the HTTP rules evaluation. Thus the action names ACT_HTTP_REPLACE_HDR and ACT_HTTP_REPLACE_VAL are removed. The action type is now set to 0 to evaluate the whole header or to 1 to evaluate every comma-delimited values. The function http_transform_header_str() is renamed to http_replace_hdrs() to be more explicit and the function http_transform_header() is removed. In fact, this last one is now more or less the new action function. The lua code has been updated accordingly to use http_replace_hdrs().	2020-01-20 15:18:45 +01:00
Christopher Faulet	006f6507d7	MINOR: actions: Use an integer to set the action type <action> field in the act_rule structure is now an integer. The act_name values are used for all actions without action function (but it is not a pre-requisit though) or the action will have no effect. But for all other actions, any integer value may used, only the action function will take care of it. The default for such actions is ACT_CUSTOM.	2020-01-20 15:18:45 +01:00
Christopher Faulet	245cf795c1	MINOR: actions: Add flags to configure the action behaviour Some flags can now be set on an action when it is registered. The flags are defined in the act_flag enum. For now, only ACT_FLAG_FINAL may be set on an action to specify if it stops the rules evaluation. It is set on ACT_ACTION_ALLOW, ACT_ACTION_DENY, ACT_HTTP_REQ_TARPIT, ACT_HTTP_REQ_AUTH, ACT_HTTP_REDIR and ACT_TCP_CLOSE actions. But, when required, it may also be set on custom actions. Consequently, this flag is checked instead of the action type during the configuration parsing to trigger a warning when a rule inhibits all the following ones.	2020-01-20 15:18:45 +01:00
Christopher Faulet	105ba6cc54	MINOR: actions: Rename the act_flag enum into act_opt The flags in the act_flag enum have been renamed act_opt. It means ACT_OPT prefix is used instead of ACT_FLAG. The purpose of this patch is to reserve the action flags for the actions configuration.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cd26e8a2ec	MINOR: http-rules/tcp-rules: Call the defined action function first if defined When TCP and HTTP rules are evaluated, if an action function (action_ptr field in the act_rule structure) is defined for a given action, it is now always called in priority over the test on the action type. Concretly, for now, only custom actions define it. Thus there is no change. It just let us the choice to extend the action type beyond the existing ones in the enum.	2020-01-20 15:18:45 +01:00
Christopher Faulet	96bff76087	MINOR: actions: Regroup some info about HTTP rules in the same struct Info used by HTTP rules manipulating the message itself are splitted in several structures in the arg union. But it is possible to group all of them in a unique struct. Now, <arg.http> is used by most of these rules, which contains: * <arg.http.i> : an integer used as status code, nice/tos/mark/loglevel or action id. * <arg.http.str> : an IST used as header name, reason string or auth realm. * <arg.http.fmt> : a log-format compatible expression * <arg.http.re> : a regular expression used by replace rules	2020-01-20 15:18:45 +01:00
Christopher Faulet	58b3564fde	MINOR: actions: Add a function pointer to release args used by actions Arguments used by actions are never released during HAProxy deinit. Now, it is possible to specify a function to do so. ".release_ptr" field in the act_rule structure may be set during the configuration parsing to a specific deinit function depending on the action type.	2020-01-20 15:18:45 +01:00
Christopher Faulet	e00d06c99f	MINOR: http-rules: Handle all message rewrites the same way In HTTP rules, error handling during a rewrite is now handle the same way for all rules. First, allocation errors are reported as internal errors. Then, if soft rewrites are allowed, rewrite errors are ignored and only the failed_rewrites counter is incremented. Otherwise, when strict rewrites are mandatory, interanl errors are returned. For now, only soft rewrites are supported. Note also that the warning sent to notify a rewrite failure was removed. It will be useless once the strict rewrites will be possible.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a00071e2e5	MINOR: http-ana: Add a txn flag to support soft/strict message rewrites the HTTP_MSGF_SOFT_RW flag must now be set on the HTTP transaction to ignore rewrite errors on a message, from HTTP rules. The mode is called the soft rewrites. If thes flag is not set, strict rewrites are performed. In this mode, if a rewrite error occurred, an internal error is reported. For now, HTTP_MSGF_SOFT_RW is always set and there is no way to switch a transaction in strict mode.	2020-01-20 15:18:45 +01:00
Christopher Faulet	a08546bb5a	MINOR: counters: Remove failed_secu counter and use denied_resp instead The failed_secu counter is only used for the servers stats. It is used to report the number of denied responses. On proxies, the same info is stored in the denied_resp counter. So, it is more consistent to use the same field for servers.	2020-01-20 15:18:45 +01:00
Christopher Faulet	0159ee4032	MINOR: stats: Report internal errors in the proxies/listeners/servers stats The stats field ST_F_EINT has been added to report internal errors encountered per proxy, per listener and per server. It appears in the CLI export and on the HTML stats page.	2020-01-20 15:18:45 +01:00
Christopher Faulet	30a2a3724b	MINOR: http-rules: Add more return codes to let custom actions act as normal ones When HTTP/TCP rules are evaluated, especially HTTP ones, some results are possible for normal actions and not for custom ones. So missing return codes (ACT_RET_) have been added to let custom actions act as normal ones. Concretely following codes have been added: * ACT_RET_DENY : deny the request/response. It must be handled by the caller * ACT_RET_ABRT : abort the request/response, handled by action itsleft. * ACT_RET_INV : invalid request/response	2020-01-20 15:18:45 +01:00
Christopher Faulet	4d90db5f4c	MINOR: http-rules: Add a rule result to report internal error Now, when HTTP rules are evaluated, HTTP_RULE_RES_ERROR must be returned when an internal error is catched. It is a way to make the difference between a bad request or a bad response and an error during its processing.	2020-01-20 15:18:45 +01:00
Christopher Faulet	d4ce6c2957	MINOR: counters: Add a counter to report internal processing errors This counter, named 'internal_errors', has been added in frontend and backend counters. It should be used when a internal error is encountered, instead for failed_req or failed_resp.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cb5501327c	BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules Functions to deinitialize the HTTP rules are buggy. These functions does not check the action name to release the right part in the arg union. Only few info are released. For auth rules, the realm is released and there is no problem here. But the regex <arg.hdr_add.re> is always unconditionally released. So it is easy to make these functions crash. For instance, with the following rule HAProxy crashes during the deinit : http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)] For now, These functions are simply removed and we rely on the deinit function used for TCP rules (renamed as deinit_act_rules()). This patch fixes the bug. But arguments used by actions are not released at all, this part will be addressed later. This patch must be backported to all stable versions.	2020-01-20 15:18:45 +01:00
Willy Tarreau	ee1a6fc943	MINOR: connection: make the last arg of subscribe() a struct wait_event* The subscriber used to be passed as a "void param" that was systematically cast to a struct wait_event. By now it appears clear that the subscribe() call at every layer is well defined and always takes a pointer to an event subscriber of type wait_event, so let's enforce this in the functions' prototypes, remove the intermediary variables used to cast it and clean up the comments to clarify what all these functions do in their context.	2020-01-17 18:30:37 +01:00
Willy Tarreau	7872d1fc15	MEDIUM: connection: merge the send_wait and recv_wait entries In practice all callers use the same wait_event notification for any I/O so instead of keeping specific code to handle them separately, let's merge them and it will allow us to create new events later.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3a9312af8f	REORG: stream/backend: move backend-specific stuff to backend.c For more than a decade we've kept all the sess_update_st_*() functions in stream.c while they're only there to work in relation with what is currently being done in backend.c (srv_redispatch_connect, connect_server, etc). Let's move all this pollution over there and take this opportunity to try to find slightly less confusing names for these old functions whose role is only to handle transitions from one specific stream-int state: sess_update_st_rdy_tcp() -> back_handle_st_rdy() sess_update_st_con_tcp() -> back_handle_st_con() sess_update_st_cer() -> back_handle_st_cer() sess_update_stream_int() -> back_try_conn_req() sess_prepare_conn_req() -> back_handle_st_req() sess_establish() -> back_establish() The last one remained in stream.c because it's more or less a completion function which does all the initialization expected on a connection success or failure, can set analysers and emit logs. The other ones could possibly slightly benefit from being modified to take a stream-int instead since it's really what they're working with, but it's unimportant here.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3381bf89e3	MEDIUM: connection: get rid of CO_FL_CURR_* flags These ones used to serve as a set of switches between CO_FL_SOCK_* and CO_FL_XPRT_, and now that the SOCK layer is gone, they're always a copy of the last know CO_FL_XPRT_ ones that is resynchronized before I/O events by calling conn_refresh_polling_flags(), and that are pushed back to FDs when detecting changes with conn_xprt_polling_changes(). While these functions are not particularly heavy, what they do is totally redundant by now because the fd_want_/fd_stop_() actions already perform test-and-set operations to decide to create an entry or not, so they do the exact same thing that is done by conn_xprt_polling_changes(). As such it is pointless to call that one, and given that the only reason to keep CO_FL_CURR_* is to detect changes there, we can now remove them. Even if this does only save very few cycles, this removes a significant complexity that has been responsible for many bugs in the past, including the last one affecting FreeBSD. All tests look good, and no performance regressions were observed.	2020-01-17 17:45:12 +01:00
Willy Tarreau	e2a0eeca77	MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only CO_FL_WAIT_ROOM is set by the splicing function in raw_sock, and cleared by the stream-int when splicing is disabled, as well as in conn_refresh_polling_flags() so that a new call to ->rcv_pipe() could be attempted by the I/O callbacks called from conn_fd_handler(). This clearing in conn_refresh_polling_flags() makes no sense anymore and is in no way related to the polling at all. Since we don't call them from there anymore it's better to clear it before attempting to receive, and to set it again later. So let's move this operation where it should be, in raw_sock_to_pipe() so that it's now symmetric. It was also placed in raw_sock_to_buf() so that we're certain that it gets cleared if an attempt to splice is replaced with a subsequent attempt to recv(). And these were currently already achieved by the call to conn_refresh_polling_flags(). Now it could theorically be removed from the stream-int.	2020-01-17 17:19:27 +01:00
Willy Tarreau	17ccd1a356	BUG/MEDIUM: connection: add a mux flag to indicate splice usability Commit `c640ef1a7d` ("BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible") fixed splicing in TCP and legacy mode but broke it badly in HTX mode. What happens in HTX mode is that the channel's to_forward value remains set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is not a reliable signal anymore to indicate whether more data are expected or not. Thus, when data are spliced out of the mux using rcv_pipe(), even when the end is reached (that only the mux knows about), the call to rcv_buf() to get the final HTX blocks completing the message were skipped and there was often no new event to wake this up, resulting in transfer timeouts at the end of large objects. All this goes down to the fact that the channel has no more information about whether it can splice or not despite being the one having to take the decision to call rcv_pipe() or not. And we cannot afford to call rcv_buf() inconditionally because, as the commit above showed, this reduces the forwarding performance by 2 to 3 in TCP and legacy modes due to data lying in the buffer preventing splicing from being used later. The approach taken by this patch consists in offering the muxes the ability to report a bit more information to the upper layers via the conn_stream. This information could simply be to indicate that more data are awaited but the real need being to distinguish splicing and receiving, here instead we clearly report the mux's willingness to be called for splicing or not. Hence the flag's name, CS_FL_MAY_SPLICE. The mux sets this flag when it knows that its buffer is empty and that data waiting past what is currently known may be spliced, and clears it when it knows there's no more data or that the caller must fall back to rcv_buf() instead. The stream-int code now uses this to determine if splicing may be used or not instead of looking at the rcv_pipe() callbacks through the whole chain. And after the rcv_pipe() call, it checks the flag again to decide whether it may safely skip rcv_buf() or not. All this bitfield dance remains a bit complex and it starts to appear obvious that splicing vs reading should be a decision of the mux based on permission granted by the data layer. This would however increase the API's complexity but definitely need to be thought about, and should even significantly simplify the data processing layer. The way it was integrated in mux-h1 will also result in no more calls to rcv_pipe() on chunked encoded data, since these ones are currently disabled at the mux level. However once the issue with chunks+splice is fixed, it will be important to explicitly check for curr_len\|CHNK to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk. This fix must be backported to 2.1 and 2.0.	2020-01-17 17:00:12 +01:00
Willy Tarreau	340b07e868	BUG/MAJOR: hashes: fix the signedness of the hash inputs Wietse Venema reported in the thread below that we have a signedness issue with our hashes implementations: due to the use of const char* for the input key that's often text, the crc32, sdbm, djb2, and wt6 algorithms return a platform-dependent value for binary input keys containing bytes with bit 7 set. This means that an ARM or PPC platform will hash binary inputs differently from an x86 typically. Worse, some algorithms are well defined in the industry (like CRC32) and do not provide the expected result on x86, possibly causing interoperability issues (e.g. a user-agent would fail to compare the CRC32 of a message body against the one computed by haproxy). Fortunately, and contrary to the first impression, the CRC32c variant used in the PROXY protocol processing is not affected. Thus the impact remains very limited (the vast majority of input keys are text-based, such as user-agent headers for exmaple). This patch addresses the issue by fixing all hash functions' prototypes (even those not affected, for API consistency). A reg test will follow in another patch. The vast majority of users do not use these hashes. And among those using them, very few will pass them on binary inputs. However, for the rare ones doing it, this fix MAY have an impact during the upgrade. For example if the package is upgraded on one LB then on another one, and the CRC32 of a binary input is used as a stick table key (why?) then these CRCs will not match between both nodes. Similarly, if "hash-type ... crc32" is used, LB inconsistency may appear during the transition. For this reason it is preferable to apply the patch on all nodes using such hashes at the same time. Systems upgraded via their distros will likely observe the least impact since they're expected to be upgraded within a short time frame. And it is important for distros NOT to skip this fix, in order to avoid distributing an incompatible implementation of a hash. This is the reason why this patch is tagged as MAJOR, eventhough it's extremely unlikely that anyone will ever notice a change at all. This patch must be backported to all supported branches since the hashes were introduced in 1.5-dev20 (commit `98634f0c`). Some parts may be dropped since implemented later. Link to Wietse's report: https://marc.info/?l=postfix-users&m=157879464518535&w=2	2020-01-16 08:23:42 +01:00
Willy Tarreau	f31af9367e	MEDIUM: lua: don't call the GC as often when dealing with outgoing connections In order to properly close connections established from Lua in case a Lua context dies, the context currently automatically gets a flag HLUA_MUST_GC set whenever an outgoing connection is used. This causes the GC to be enforced on the context's death as well as on yield. First, it does not appear necessary to do it when yielding, since if the connections die they are already cleaned up. Second, the problem with the flag is that even if a connection gets properly closed, the flag is not removed and the GC continues to be called on the Lua context. The impact on performance looks quite significant, as noticed and diagnosed by Sadasiva Gujjarlapudi in the following thread: https://www.mail-archive.com/haproxy@formilux.org/msg35810.html This patch changes the flag for a counter so that each created connection increments it and each cleanly closed connection decrements it. That way we know we have to call the GC on the context's death only if the count is non-null. As reported in the thread above, the Lua performance gain is now over 20% by doing this. Thanks to Sada and Thierry for the design discussion and tests that led to this solution.	2020-01-14 10:12:31 +01:00
Olivier Houchard	3c4f40acbf	BUG/MEDIUM: tasks: Use the MT macros in tasklet_free(). In tasklet_free(), to attempt to remove ourself, use MT_LIST_DEL, we can't just use LIST_DEL(), as we theorically could be in the shared tasklet list. This should be backported to 2.1.	2020-01-10 16:56:59 +01:00
Florian Tham	9205fea13a	MINOR: http: Add 404 to http-request deny This patch adds http status code 404 Not Found to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Florian Tham	272e29b5cc	MINOR: http: Add 410 to http-request deny This patch adds http status code 410 Gone to http-request deny. See issue #80.	2020-01-08 16:15:23 +01:00
Willy Tarreau	eaf05be0ee	OPTIM: polling: do not create update entries for FD removal In order to reduce the number of poller updates, we can benefit from the fact that modern pollers use sampling to report readiness and that under load they rarely report the same FD multiple times in a row. As such it's not always necessary to disable such FDs especially when we're almost certain they'll be re-enabled again and will require another set of syscalls. Now instead of creating an update for a (possibly temporary) removal, we only perform this removal if the FD is reported again as ready while inactive. In addition this is performed via another update so that alternating workloads like transfers have a chance to re-enable the FD without any syscall during the loop (typically after the data that filled a buffer have been sent). However we only do that for single- threaded FDs as the other ones require a more complex setup and are not on the critical path. This does cause a few spurious wakeups but almost totally eliminates the calls to epoll_ctl() on connections seeing intermitent traffic like HTTP/1 to a server or client. A typical example with 100k requests for 4 kB objects over 200 connections shows that the number of epoll_ctl() calls doesn't depend on the number of requests anymore but most exclusively on the number of established connections: Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 57.09 0.499964 0 654361 321190 recvfrom 38.33 0.335741 0 369097 1 epoll_wait 4.56 0.039898 0 44643 epoll_ctl 0.02 0.000211 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.875814 1068301 321391 total After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 59.25 0.504676 0 657600 323630 recvfrom 40.68 0.346560 0 374289 1 epoll_wait 0.04 0.000370 0 620 epoll_ctl 0.03 0.000228 1 200 200 connect ------ ----------- ----------- --------- --------- ---------------- 100.00 0.851834 1032709 323831 total As expected there is also a slight increase of epoll_wait() calls since delaying de-activation of events can occasionally cause one spurious wakeup.	2019-12-27 16:38:47 +01:00
Willy Tarreau	19689882e6	MINOR: poller: do not call the IO handler if the FD is not active For now this almost never happens but with subsequent patches it will become more important not to uselessly call the I/O handlers if the FD is not active.	2019-12-27 16:38:47 +01:00
Willy Tarreau	0fbc318e24	CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE Both flags became equal in commit `82967bf9` ("MINOR: connection: adjust CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the overlap between xprt_done_cb() and wake() after the removal of the DATA specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the "_DONE" version already covers everything and explains the intent well enough.	2019-12-27 16:38:47 +01:00
Willy Tarreau	4970e5adb7	REORG: connection: move tcp_connect_probe() to conn_fd_check() The function is not TCP-specific at all, it covers all FD-based sockets so let's move this where other similar functions are, in connection.c, and rename it conn_fd_check().	2019-12-27 16:38:43 +01:00
Willy Tarreau	11ef0837af	MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP In practice it's all pollers except select(). It turns out that we're keeping some legacy code only for select and enforcing it on all pollers, let's offer the pollers the ability to declare that they do not need that.	2019-12-27 14:04:33 +01:00
Lukas Tribus	a26d1e1324	BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled with the no-deprecated option. Remove existing, incomplete guards and add a compatibility macro in openssl-compat.h, just as OpenSSL does: `bf4006a6f9/include/openssl/ssl.h (L1486)` This should be backported as far as 2.0 and probably even 1.9.	2019-12-21 06:46:55 +01:00
Rosen Penev	b3814c2ca8	BUG/MINOR: ssl: openssl-compat: Fix getm_ defines LIBRESSL_VERSION_NUMBER evaluates to 0 under OpenSSL, making the condition always true. Check for the define before checking it. Signed-off-by: Rosen Penev <rosenp@gmail.com> [wt: to be backported as far as 1.9]	2019-12-20 16:01:31 +01:00
Willy Tarreau	dd0e89a084	BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing Since 1.9 with commit `b20aa9eef3` ("MAJOR: tasks: create per-thread wait queues") a task bound to a single thread will not use locks when being queued or dequeued because the wait queue is assumed to be the owner thread's. But there exists a rare situation where this is not true: the health check tasks may be running on one thread waiting for a response, and may in parallel be requeued by another thread calling health_adjust() after a detecting a response error in traffic when "observe l7" is set, and "fastinter" is lower than "inter", requiring to shorten the running check's timeout. In this case, the task being requeued was present in another thread's wait queue, thus opening a race during task_unlink_wq(), and gets requeued into the calling thread's wait queue instead of the running one's, opening a second race here. This patch aims at protecting against the risk of calling task_unlink_wq() from one thread while the task is queued on another thread, hence unlocked, by introducing a new TASK_SHARED_WQ flag. This new flag indicates that a task's position in the wait queue may be adjusted by other threads than then one currently executing it. This means that such WQ manipulations must be performed under a lock. There are two types of such tasks: - the global ones, using the global wait queue (technically speaking, those whose thread_mask has at least 2 bits set). - some local ones, which for now will be placed into the global wait queue as well in order to benefit from its lock. The flag is automatically set on initialization if the task's thread mask indicates more than one thread. The caller must also set it if it intends to let other threads update the task's expiration delay (e.g. delegated I/Os), or if it intends to change the task's affinity over time as this could lead to the same situation. Right now only the situation described above seems to be affected by this issue, and it is very difficult to trigger, and even then, will often have no visible effect beyond stopping the checks for example once the race is met. On my laptop it is feasible with the following config, chained to httpterm: global maxconn 400 # provoke FD errors, calling health_adjust() defaults mode http timeout client 10s timeout server 10s timeout connect 10s listen px bind :8001 option httpchk /?t=50 server sback 127.0.0.1:8000 backup server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7 This patch will automatically address the case for the checks because check tasks are created with multiple threads bound and will get the TASK_SHARED_WQ flag set. If in the future more tasks need to rely on this (multi-threaded muxes for example) and the use of the global wait queue becomes a bottleneck again, then it should not be too difficult to place locks on the local wait queues and queue the task on its bound thread. This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task". Many thanks to William Dauchy for providing detailed traces allowing to spot the problem.	2019-12-19 14:42:22 +01:00
Christopher Faulet	76014fd118	MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state During H1 parsing, the HTX EOM block is added before switching the message state to H1_MSG_DONE. It is an exception in the way to convert an H1 message to HTX. Except for this block, the message is first switched to the right state before starting to add the corresponding HTX blocks. For instance, the message is switched in H1_MSG_DATA state and then the HTX DATA blocks are added. With this patch, the message is switched to the H1_MSG_DONE state when all data blocks or trailers were processed. It is the caller responsibility to call h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far easier to catch failures when the HTX buffer is full. The H1 and FCGI muxes have been updated accordingly. This patch may eventually be backported to 2.1 if it helps other backports.	2019-12-11 16:46:16 +01:00
Willy Tarreau	fec56c6a76	BUG/MINOR: listener: fix off-by-one in state name check As reported in issue #380, the state check in listener_state_str() is invalid as it allows state value 9 to report crap. We don't use such a state value so the issue should never happen unless the memory is already corrupted, but better clean this now while it's harmless. This should be backported to all maintained branches.	2019-12-11 15:51:37 +01:00
Willy Tarreau	d26c9f9465	BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers If a new process is started with -sf and it fails to bind, it may send a SIGTTOU to the master process in hope that it will temporarily unbind. Unfortunately this one doesn't catch it and stops to background instead of forwarding the signal to the workers. The same is true for SIGTTIN. This commit simply implements an extra signal handler for the master to deal with such signals that must be passed down to the workers. It must be backported as far as 1.8, though there the code differs in that it's entirely in haproxy.c and doesn't require an extra sig handler.	2019-12-11 14:26:53 +01:00
Willy Tarreau	c49ba52524	MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups We used to have wake_expired_tasks() wake up tasks and return the next expiration delay. The problem this causes is that we have to call it just before poll() in order to consider latest timers, but this also means that we don't wake up all newly expired tasks upon return from poll(), which thus systematically requires a second poll() round. This is visible when running any scheduled task like a health check, as there are systematically two poll() calls, one with the interval, nothing is done after it, and another one with a zero delay, and the task is called: listen test bind *:8001 server s1 127.0.0.1:1111 check 09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0 09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0 09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0 >> nothing run here, as the expired task was not woken up yet. 09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0 09:37:39.202505 epoll_wait(3, [], 200, 0) = 0 09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0 >> now the expired task was woken up 09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0 09:37:39.202683 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0 09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:37:39.202715 close(7) = 0 Let's instead split the function in two parts: - the first part, wake_expired_tasks(), called just before process_runnable_tasks(), wakes up all expired tasks; it doesn't compute any timeout. - the second part, next_timer_expiry(), called just before poll(), only computes the next timeout for the current thread. Thanks to this, all expired tasks are properly woken up when leaving poll, and each poll call's timeout remains up to date: 09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0 09:41:16.270457 epoll_wait(3, [], 200, 999) = 0 09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0 09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0 09:41:17.270323 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0 09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:41:17.270367 close(7) = 0 This may be backported to 2.1 and 2.0 though it's unlikely to bring any user-visible improvement except to clarify debugging.	2019-12-11 09:42:58 +01:00
Willy Tarreau	440d09b244	BUG/MINOR: tasks: only requeue a task if it was already in the queue Commit `0742c314c3` ("BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().") had a slight side effect on expired timeouts, which is that when used before a timeout is updated, it will cause an existing task to be requeued earlier than its expected timeout when done before being updated, resulting in the next poll wakup timeout too early or even instantly if the previous wake up was done on a timeout. This is visible in strace when health checks are enabled because there are two poll calls, one of which has a short or zero delay. The correct solution is to only requeue a task if it was already in the queue. This can be backported to all branches having the fix above.	2019-12-11 09:21:36 +01:00
Willy Tarreau	a1d97f88e0	REORG: listener: move the global listener queue code to listener.c The global listener queue code and declarations were still lying in haproxy.c while not needed there anymore at all. This complicates the code for no reason. As a result, the global_listener_queue_task and the global_listener_queue were made static.	2019-12-10 14:16:03 +01:00
Willy Tarreau	241797a3fc	MINOR: listener: split dequeue_all_listener() in two We use it half times for the global_listener_queue and half times for a proxy's queue and this requires the callers to take care of these. Let's split it in two versions, the current one working only on the global queue and another one dedicated to proxies for the per-proxy queues. This cleans up quite a bit of code.	2019-12-10 14:14:09 +01:00
Willy Tarreau	a45a8b5171	MEDIUM: init: set NO_NEW_PRIVS by default when supported HAProxy doesn't need to call executables at run time (except when using external checks which are strongly recommended against), and is even expected to isolate itself into an empty chroot. As such, there basically is no valid reason to allow a setuid executable to be called without the user being fully aware of the risks. In a situation where haproxy would need to call external checks and/or disable chroot, exploiting a vulnerability in a library or in haproxy itself could lead to the execution of an external program. On Linux it is possible to lock the process so that any setuid bit present on such an executable is ignored. This significantly reduces the risk of privilege escalation in such a situation. This is what haproxy does by default. In case this causes a problem to an external check (for example one which would need the "ping" command), then it is possible to disable this protection by explicitly adding this directive in the global section. If enabled, it is possible to turn it back off by prefixing it with the "no" keyword. Before the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" uid=0(root) gid=0(root) groups=0(root After the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?	2019-12-06 17:20:26 +01:00
Olivier Houchard	0742c314c3	BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity(). In task_set_affinity(), leave the wait_queue if any before changing the affinity, and re-enter a wait queue once it is done. If we don't do that, the task may stay in the wait queue of another thread, and we later may end up modifying that wait queue while holding no lock, which could lead to memory corruption. THis should be backported to 2.1, 2.0 and 1.9.	2019-12-05 15:11:19 +01:00
Willy Tarreau	d96f1126fe	MEDIUM: init: prevent process and thread creation at runtime Some concerns are regularly raised about the risk to inherit some Lua files which make use of a fork (e.g. via os.execute()) as well as whether or not some of bugs we fix might or not be exploitable to run some code. Given that haproxy is event-driven, any foreground activity completely stops processing and is easy to detect, but background activity is a different story. A Lua script could very well discretely fork a sub-process connecting to a remote location and taking commands, and some injected code could also try to hide its activity by creating a process or a thread without blocking the rest of the processing. While such activities should be extremely limited when run in an empty chroot without any permission, it would be better to get a higher assurance they cannot happen. This patch introduces something very simple: it limits the number of processes and threads to zero in the workers after the last thread was created. By doing so, it effectively instructs the system to fail on any fork() or clone() syscall. Thus any undesired activity has to happen in the foreground and is way easier to detect. This will obviously break external checks (whose concept is already totally insecure), and for this reason a new option "insecure-fork-wanted" was added to disable this protection, and it is suggested in the fork() error report from the checks. It is obviously recommended not to use it and to reconsider the reasons leading to it being enabled in the first place. If for any reason we fail to disable forks, we still start because it could be imaginable that some operating systems refuse to set this limit to zero, but in this case we emit a warning, that may or may not be reported since we're after the fork point. Ideally over the long term it should be conditionned by strict-limits and cause a hard fail.	2019-12-03 11:49:00 +01:00
Emmanuel Hocdet	e9a100e982	BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0 Commit `d4f9a60e` "MINOR: ssl: deduplicate ca-file" uses undeclared X509 functions when build with openssl < 1.1.0. Introduce this functions in openssl-compat.h . Fix issue #385.	2019-12-03 07:13:12 +01:00
Emmanuel Hocdet	d4f9a60ee2	MINOR: ssl: deduplicate ca-file Typically server line like: 'server-template srv 1-1000 *:443 ssl ca-file ca-certificates.crt' load ca-certificates.crt 1000 times and stay duplicated in memory. Same case for bind line: ca-file is loaded for each certificate. Same 'ca-file' can be load one time only and stay deduplicated in memory. As a corollary, this will prevent file access for ca-file when updating a certificate via CLI.	2019-11-28 11:11:20 +01:00
Willy Tarreau	cdb27e8295	MINOR: version: this is development again, update the status It's basically a revert of commit `9ca7f8cea`.	2019-11-25 20:38:32 +01:00
Willy Tarreau	2e077f8d53	[RELEASE] Released version 2.2-dev0 Released version 2.2-dev0 with the following main changes : - exact copy of 2.1.0	2019-11-25 20:36:16 +01:00
Willy Tarreau	9ca7f8ceac	MINOR: version: indicate that this version is stable Also indicate that it will get fixes till ~Q1 2021.	2019-11-25 19:47:23 +01:00
Willy Tarreau	c22d5dfeb8	MINOR: h2: add a function to report H2 error codes as strings Just like we have frame type to string, let's have error to string to improve debugging and traces.	2019-11-25 11:34:26 +01:00
Willy Tarreau	8f3ce06f14	MINOR: ist: add ist_find_ctl() This new function looks for the first control character in a string (a char whose value is between 0x00 and 0x1F included) and returns it, or NULL if there is none. It is optimized for quickly evicting non-matching strings and scans ~0.43 bytes per cycle. It can be used as an accelerator when it's needed to look up several of these characters (e.g. CR/LF/NUL).	2019-11-25 10:33:35 +01:00
Willy Tarreau	47479eb0e7	MINOR: version: emit the link to the known bugs in output of "haproxy -v" The link to the known bugs page for the current version is built and reported there. When it is a development version (less than 2 dots), instead a link to github open issues is reported as there's no way to be sure about the current situation in this case and it's better that users report their trouble there.	2019-11-21 18:48:20 +01:00
Willy Tarreau	08dd202d73	MINOR: version: report the version status in "haproxy -v" As discussed on Discourse here: https://discourse.haproxy.org/t/haproxy-branch-support-lifetime/4466 it's not always easy for end users to know the lifecycle of the version they are using. This patch introduces a "Status" line in the output of "haproxy -vv" indicating whether it's a development, stable, long-term supported version, possibly with an estimated end of life for the branch when it can be anticipated (e.g. for stable versions). This field should be adjusted when creating a major release to reflect the new status. It may make sense to backport this to other branches to clarify the situation.	2019-11-21 18:47:54 +01:00
William Lallemand	8b453912ce	MINOR: ssl: ssl_sock_prepare_ctx() return an error code Rework ssl_sock_prepare_ctx() so it fills a buffer with the error messages instead of using ha_alert()/ha_warning(). Also returns an error code (ERR_*) instead of the number of errors.	2019-11-21 17:48:11 +01:00
Daniel Corbett	f8716914c7	MEDIUM: dns: Add resolve-opts "ignore-weight" It was noted in #48 that there are times when a configuration may use the server-template directive with SRV records and simultaneously want to control weights using an agent-check or through the runtime api. This patch adds a new option "ignore-weight" to the "resolve-opts" directive. When specified, any weight indicated within an SRV record will be ignored. This is for both initial resolution and ongoing resolution.	2019-11-21 17:25:31 +01:00
Fr�d�ric L�caille	ec1c10b839	MINOR: peers: Add debugging information to "show peers". This patch adds three counters to help in debugging peers protocol issues to "peer" struct: ->no_hbt counts the number of reconnection period without receiving heartbeat ->new_conn counts the number of reconnections after ->reconnect timeout expirations. ->proto_err counts the number of protocol errors.	2019-11-19 14:48:28 +01:00
Fr�d�ric L�caille	33cab3c0eb	MINOR: peers: Add TX/RX heartbeat counters. Add RX/TX heartbeat counters to "peer" struct to have an idead about which peer is alive or not. Dump these counters values on the CLI via "show peers" command.	2019-11-19 14:48:25 +01:00

... 2 3 4 5 6 ...

4144 Commits