haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-12-16 07:01:38 +01:00

Author	SHA1	Message	Date
Willy Tarreau	cf8be50a3d	MINOR: debug: report in post_mortem whether a container was detected Containers often cause significant trouble depending on how they're set up, and it's not always trivial for their users to extract info from them. Here we're trying to detect if we're running inside a container on Linux. There are plenty of approaches and none is perfectly clean nor reliable, which makes sense since the goal is to remain transparent enough. One interesting approach is to rely on the observation that containers generally do not expose most kernel threads, and that the very first of them are extremely stable across all kernel versions: pid 2 was called "keventd" in kernel 2.4, became "kthreadd" in kernel 2.6, and has since not changed. This is true on all architectures tested, even with highly stripped down kernels such as those found on 15 year-old OpenWRT images. And this one doesn't appear inside containers. Thus here we check if we find such a thread via /proc and whether it's called keventd or kthreadd, to detect a container, and we set the "cont_techno" variable to "yes" or "no" depending on what is found.	2023-11-23 15:39:21 +01:00
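The pid 2 check described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the commit: the helper names are hypothetical, the task name is read from a "comm"-style file (such as /proc/2/comm on recent Linux kernels), and an unreadable file is assumed to mean that the kernel thread is not visible.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if <name> is one of the two historical names of pid 2. */
static int is_kthread_name(const char *name)
{
	assert(name != NULL);
	return strcmp(name, "kthreadd") == 0 || strcmp(name, "keventd") == 0;
}

/* Hypothetical helper: reads a task name from <comm_path> and returns
 * "no" when it is a known kernel thread name, otherwise "yes".
 * Assumption: a missing or unreadable file means pid 2 is hidden,
 * which is what is expected inside a container.
 */
static const char *detect_container(const char *comm_path)
{
	char name[64];
	FILE *f = fopen(comm_path, "r");

	if (!f)
		return "yes";
	if (!fgets(name, sizeof(name), f)) {
		fclose(f);
		return "yes";
	}
	fclose(f);
	name[strcspn(name, "\n")] = '\0'; /* strip the trailing newline */
	return is_kthread_name(name) ? "no" : "yes";
}
```

On a regular Linux host, detect_container("/proc/2/comm") would be expected to return "no"; taking the path as a parameter only serves to make the sketch testable.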
Willy Tarreau	4e3f9921de	MINOR: debug: add OS/hardware info to the post_mortem struct Let's extract some info about the system (board model, vendor etc), this will indicate some hypervisors, some cloud instances or some uncommon embedded boards etc. Typically, vmware, qemu and raspberry-pi are visible here and can help during the troubleshooting session.	2023-11-23 15:39:21 +01:00
Willy Tarreau	0184597522	MINOR: debug: start to create a new struct post_mortem The goal here is to accumulate precious debugging information in a struct that is easy to find in memory. It's aligned to 256 bytes as it also helps. We'll progressively add a lot of info about the startup conditions, the operating system, the hardware and hypervisor so as to limit the number of round trips between developers and users during debugging sessions. Also, opening a core file with a hex editor should often be sufficient to extract most of the info. In addition, a new "show dev" command will show this information so that it can be checked at runtime without having to wait for a crash (e.g. if a limit is bad in a container, better know it early). For now the struct only contains utsname that's fed at boot time.	2023-11-23 15:39:21 +01:00
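A minimal sketch of the idea, assuming a C11 compiler: a statically allocated, 256-byte aligned struct that holds the utsname collected at boot. The struct name, the "tag" field and pm_init() are hypothetical and only illustrate the approach; the real struct layout is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical layout, for illustration only. */
struct pm_sketch {
	char tag[16];        /* hypothetical marker to look for in a core dump */
	struct utsname uts;  /* filled once at boot by uname() */
};

/* The 256-byte alignment makes the struct start on a round address,
 * which is easier to spot in a hex dump.
 */
static _Alignas(256) struct pm_sketch pm_info;

/* Fills the struct at startup; returns 0 on success, -1 on failure. */
static int pm_init(void)
{
	memcpy(pm_info.tag, "PM-SKETCH", sizeof("PM-SKETCH"));
	assert(((uintptr_t)&pm_info & 255) == 0);
	return uname(&pm_info.uts) < 0 ? -1 : 0;
}
```

Keeping the data in a static object rather than on the heap means it exists from startup and sits at a fixed place in the process image.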
Willy Tarreau	2268f10dd6	DEBUG: tinfo: store the pthread ID and the stack pointer in tinfo When debugging a core, it's difficult to match a given gdb thread number against an internal thread. Let's just store the pthread ID and the stack pointer in each tinfo. This could help in the future by allowing to just glance over them and pick the right one depending what info is found first.	2023-11-23 14:32:55 +01:00
Willy Tarreau	53da8bfcb6	BUG/MINOR: server: do not leak default-server in defaults sections When a default-server directive is used in a defaults section, it's never freed and the "defaults" proxy gets reset without freeing the fields from that default-server. Normally there is no allocation there, except for the config file location stored in srv->conf.file from an strdup() since commit 9394a9444 ("REORG: server: move alert traces in parse_server") that appeared in 2.4. In addition, if a "default-server" directive appears multiple times in a defaults section, one more entry will be leaked per call. This commit addresses this by checking that we don't overwrite the file upon multiple calls, and by clearing it when resetting the default proxy. This should be backported to 2.4.	2023-11-23 14:32:55 +01:00
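The two halves of this fix (do not overwrite an already stored file name, and release it on reset) can be illustrated with the sketch below. All names are hypothetical, and dup_string() is a portable stand-in for the strdup() call mentioned in the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the server's configuration fields. */
struct srv_conf_sketch {
	char *file; /* heap copy of the config file name, or NULL */
};

/* Portable stand-in for strdup(). */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Stores the file name only on the first call, so that a repeated
 * directive does not leak the previous copy. Returns 0 on success,
 * -1 on allocation failure.
 */
static int conf_set_file(struct srv_conf_sketch *conf, const char *file)
{
	assert(conf != NULL && file != NULL);
	if (conf->file)
		return 0;
	conf->file = dup_string(file);
	return conf->file ? 0 : -1;
}

/* Releases the stored name when the defaults are reset. */
static void conf_reset(struct srv_conf_sketch *conf)
{
	free(conf->file);
	conf->file = NULL;
}
```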
Frédéric Lécaille	7fc52357cb	BUG/MINOR: quic: Possible RX packet memory leak under heavy load This bug could be reproduced with -dMfail and h2load generating plenty of connections. A "show pools" CLI command showed that some memory related to the RX packet pool was never released. Furthermore, adding an RX packet counter to each connection and a BUG_ON() in quic_conn_release() proved that this unreleased memory was related to RX packets which were not linked to a connection. The culprit is quic_dgram_parse(), which does not release some RX packet memory before exiting after the connection thread affinity has changed. Must be backported as far as 2.7.	2023-11-22 18:03:26 +01:00
Frédéric Lécaille	cd225da46c	BUG/MINOR: quic: Possible leak of TX packets under heavy load This bug could be reproduced with -dMfail and detected by adding a TX packet counter to the QUIC connection. When released by calling quic_conn_release(), the connection should have a null counter of TX packets. This was not always the case. This could occur during the handshake step: a first packet was built, then another one should have followed in the same datagram, but failed due to a memory allocation issue. As the datagram length and first TX packet were not written in the TX buffer, the latter could not really be purged by qc_purge_tx_buf() even if called. This bug occurred only when building coalesced packets in the same datagram. To fix this, write the packet information (datagram length and first packet address) in the TX buffer before purging it. Must be backported as far as 2.6.	2023-11-22 18:03:26 +01:00
Frédéric Lécaille	dc8a20b317	BUG/MEDIUM: quic: Possible crash during retransmissions and heavy load This bug could be reproduced with -dMfail and detected by libasan as follows: $ ASAN_OPTIONS=disable_coredump=0:unmap_shadow_on_exit=1:abort_on_error=f quic-freeze.cfg -dMfail -dMno-cache -dM0x55 ================================================================= ==82989==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffc 0x560790cc4749 bp 0x7fff8e0e8e30 sp 0x7fff8e0e8e28 WRITE of size 8 at 0x7fff8e0ea338 thread T0 #0 0x560790cc4748 in qc_frm_free src/quic_frame.c:1222 #1 0x560790cc5260 in qc_release_frm src/quic_frame.c:1261 #2 0x560790d1de99 in qc_treat_acked_tx_frm src/quic_rx.c:312 #3 0x560790d1e708 in qc_ackrng_pkts src/quic_rx.c:370 #4 0x560790d22a1d in qc_parse_ack_frm src/quic_rx.c:694 #5 0x560790d25daa in qc_parse_pkt_frms src/quic_rx.c:988 #6 0x560790d2a509 in qc_treat_rx_pkts src/quic_rx.c:1373 #7 0x560790c72d45 in quic_conn_io_cb src/quic_conn.c:906 #8 0x560791207847 in run_tasks_from_lists src/task.c:596 #9 0x5607912095f0 in process_runnable_tasks src/task.c:876 #10 0x560791135564 in run_poll_loop src/haproxy.c:2966 #11 0x5607911363af in run_thread_poll_loop src/haproxy.c:3165 #12 0x56079113938c in main src/haproxy.c:3862 #13 0x7f92606edd09 in __libc_start_main ../csu/libc-start.c:308 #14 0x560790bcd529 in _start (/home/flecaille/src/haproxy/haproxy+0x Address 0x7fff8e0ea338 is located in stack of thread T0 at offset 1032 i #0 0x560790d29b52 in qc_treat_rx_pkts src/quic_rx.c:1341 This frame has 2 object(s): [32, 48) 'ar' (line 1380) [64, 1088) '_msg' (line 1368) <== Memory access at offset 1032 is inable HINT: this may be a false positive if your program uses some custom stacnism, swapcontext or vfork (longjmp and C++ exceptions are supported) SUMMARY: AddressSanitizer: stack-use-after-scope src/quic_frame.c:1222 i Shadow bytes around the buggy address: 0x100071c15410: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 0x100071c15420: f8 f8 f8 
f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 0x100071c15430: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 0x100071c15440: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 0x100071c15450: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 =>0x100071c15460: f8 f8 f8 f8 f8 f8 f8[f8]f8 f8 f8 f8 f8 f8 f3 f3 0x100071c15470: f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 00 00 0x100071c15480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100071c15490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100071c154a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100071c154b0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f3 f3 f3 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==82989==ABORTING AddressSanitizer:DEADLYSIGNAL AddressSanitizer:DEADLYSIGNAL AddressSanitizer:DEADLYSIGNAL AddressSanitizer:DEADLYSIGNAL AddressSanitizer:DEADLYSIGNAL AddressSanitizer:DEADLYSIGNAL Aborted (core dumped) Note that a coredump could not always be produced with all compilers. This was always the case with clang 11. When allocating frames to be retransmitted from qc_dgrams_retransmit(), if they could not be sent for any reason, they could remain attached to a local list to qc_dgrams_retransmit() and trigger a crash with libasan when releasing the original frames they were duplicated from. To fix this, always release the frames which could not be sent during retransmissions calling qc_free_frm_list() where needed. Must be backported as far as 2.6.	2023-11-22 18:03:26 +01:00
Frédéric Lécaille	34bc100b8f	MINOR: quic: Add traces to debug frames handling during retransmissions It is really annoying not to know why some retransmissions could not be done from qc_prep_hpkts(), which allocates frames, prepares packets and sends them, and in particular not to know whether frames remain allocated and attached to a list on the stack. This patch already helped in diagnosing such an issue during "-dMfail" tests.	2023-11-22 18:03:26 +01:00
Willy Tarreau	8f9e94ecff	BUILD: log: silence a build warning when threads are disabled Building without threads emits two warnings because the proxy pointer is no longer used (only serves for the lock) since 2.9 commit 9a74a6cb1 ("MAJOR: log: introduce log backends"). No backport is needed.	2023-11-22 11:21:07 +01:00
Amaury Denoyelle	89da4e9e5d	MINOR: acl: define explicit HTTP_3.0 Some ACL shortcuts are defined to match HTTP requests by their version. This exists for HTTP_1.0 to HTTP_2.0. This patch adds HTTP_3.0 definition.	2023-11-20 18:01:07 +01:00
Amaury Denoyelle	decf29d06d	MINOR: quic: remove unneeded QUIC specific stopping function On CONNECTION_CLOSE reception/emission, QUIC connections enter CLOSING state. At this stage, only CONNECTION_CLOSE can be reemitted and all other exchanges are stopped. Previously, on haproxy process stopping, if all QUIC connections were in CLOSING state, they were released before their closing timer expiration to not block the process shutdown. However, since a recent commit, the closing timer has been shortened to a more reasonable delay. It is now considered viable to respect connections closing state even on process shutdown. As such, stopping specific code in QUIC connections idle timer task was removed. A specific function quic_handle_stopping() was implemented to notify QUIC connections on shutdown from the main() function. It should have been deleted along with the removal in the QUIC idle timer task. This patch just does this.	2023-11-20 17:59:52 +01:00
Frédéric Lécaille	756b3c5f7b	BUG/MEDIUM: quic: Possible crash for connections to be killed The connections are flagged as "to be killed" asap when the peer has left (detected by sendto() "Connection refused" errno) by qc_kill_conn(). This function has to wakeup the idle timer task to release the connection (and the idle timer and the idle timer task itself). Then if in the meantime the connection was flagged as having to process some retransmissions, some packet could lead to sendto() errors again with a call to qc_kill_conn(), this time with a released idle timer task. This bug could be detected by libasan as follows: .AddressSanitizer:DEADLYSIGNAL ================================================================= ==21018==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x 560b5d898717 bp 0x7f9aaac30000 sp 0x7f9aaac2ff80 T3) ==21018==The signal is caused by a READ memory access. ==21018==Hint: address points to the zero page. . #0 0x560b5d898717 in _task_wakeup include/haproxy/task.h:209 #1 0x560b5d8a563c in qc_kill_conn src/quic_conn.c:171 #2 0x560b5d97f832 in qc_send_ppkts src/quic_tx.c:636 #3 0x560b5d981b53 in qc_send_app_pkts src/quic_tx.c:876 #4 0x560b5d987122 in qc_send_app_probing src/quic_tx.c:910 #5 0x560b5d987122 in qc_dgrams_retransmit src/quic_tx.c:1397 #6 0x560b5d8ab250 in quic_conn_app_io_cb src/quic_conn.c:712 #7 0x560b5de41593 in run_tasks_from_lists src/task.c:596 #8 0x560b5de4333c in process_runnable_tasks src/task.c:876 #9 0x560b5dd6f2b0 in run_poll_loop src/haproxy.c:2966 #10 0x560b5dd700fb in run_thread_poll_loop src/haproxy.c:3165 #11 0x7f9ab9188ea6 in start_thread nptl/pthread_create.c:477 #12 0x7f9ab90a8a2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e) AddressSanitizer can not provide additional info. 
SUMMARY: AddressSanitizer: SEGV include/haproxy/task.h:209 in _task_wakeup Thread T3 created by T0 here: #0 0x7f9ab97ac2a2 in __interceptor_pthread_create ../../../../src/libsaniti zer/asan/asan_interceptors.cpp:214 #1 0x560b5df4f3ef in setup_extra_threads src/thread.c:252 o #2 0x560b5dd730c7 in main src/haproxy.c:3856 #3 0x7f9ab8fd0d09 in __libc_start_main ../csu/libc-start.c:308 i ==21018==ABORTING AddressSanitizer:DEADLYSIGNAL Aborted (core dumped) To fix, simply reset the connection flag QUIC_FL_CONN_RETRANS_NEEDED to cancel the retransmission when qc_kill_conn is called. Note that this new bug arrived with this fix which is correct and flagged as to be backported as far as 2.6. BUG/MINOR: quic: idle timer task requeued in the past Must be backported as far as 2.6.	2023-11-20 17:17:16 +01:00
Amaury Denoyelle	a8968701c0	BUG/MAJOR: quic: complete thread migration before tcp-rules A quic_conn is instantiated and tied on the first thread which has received the first INITIAL packet. After handshake completion, listener_accept() is called. For each quic_conn, a new thread is selected among the least loaded ones. Note that this occurs earlier if handling 0-RTT data. This thread connection migration is done in two steps: * inside listener_accept(), on the origin thread, quic_conn tasks/tasklet are killed. After this, no quic_conn related processing will occur on this thread. The connection is flagged with QUIC_FL_CONN_AFFINITY_CHANGED. * as soon as the first quic_conn related processing occurs on the new thread, the migration is finalized. This allows to allocate the new tasks/tasklet directly on the destination thread. This last step on the new thread must be done prior to other quic_conn access. There are two events which may trigger it: * a packet is received on the new thread. In this case, qc_finalize_affinity_rebind() is called from quic_dgram_parse(). * the recently accepted connection is popped from accept_queue_ring via accept_queue_process(). This will call session_accept_fd() as listener.bind_conf.accept callback. This instantiates a new session and starts the connection stack via conn_xprt_start(), which itself calls qc_xprt_start() where qc_finalize_affinity_rebind() is used. A condition was recently found which could cause a closing connection to be used with qc_finalize_affinity_rebind(), which is forbidden with a BUG_ON(). This last step was not compatible with layer 4 rules such as "tcp-request connection reject" which close the connection early. In this case, most of the body of session_accept_fd() is skipped, including qc_xprt_start(), so thread migration is not finalized. At the end of the function, conn_xprt_close() is then called which flags the connection as CLOSING. 
If a datagram is received for this connection before it is released, this will call qc_finalize_affinity_rebind() which triggers its BUG_ON() to prevent thread migration for CLOSING quic_conn. FATAL: bug condition "qc->flags & ((1U << 29)|(1U << 30))" matched at src/quic_conn.c:2036 Thread 3 "haproxy" received signal SIGILL, Illegal instruction. [Switching to Thread 0x7ffff794f700 (LWP 2973030)] 0x00005555556221f3 in qc_finalize_affinity_rebind (qc=0x7ffff002d060) at src/quic_conn.c:2036 2036 BUG_ON(qc->flags & (QUIC_FL_CONN_CLOSING|QUIC_FL_CONN_DRAINING)); (gdb) bt #0 0x00005555556221f3 in qc_finalize_affinity_rebind (qc=0x7ffff002d060) at src/quic_conn.c:2036 #1 0x0000555555682463 in quic_dgram_parse (dgram=0x7fff5003ef10, from_qc=0x0, li=0x555555f38670) at src/quic_rx.c:2602 #2 0x0000555555651aae in quic_lstnr_dghdlr (t=0x555555fc4440, ctx=0x555555fc3f78, state=32832) at src/quic_sock.c:189 #3 0x00005555558c9393 in run_tasks_from_lists (budgets=0x7ffff7944c90) at src/task.c:596 #4 0x00005555558c9e8e in process_runnable_tasks () at src/task.c:876 #5 0x000055555586b7b2 in run_poll_loop () at src/haproxy.c:2966 #6 0x000055555586be87 in run_thread_poll_loop (data=0x555555d3d340 <ha_thread_info+64>) at src/haproxy.c:3165 #7 0x00007ffff7b59609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #8 0x00007ffff7a7e133 in clone () from /lib/x86_64-linux-gnu/libc.so.6 To fix this issue, ensure quic_conn migration is completed earlier inside session_accept_fd(), before any tcp rules processing. This is done by moving qc_finalize_affinity_rebind() invocation from qc_xprt_start() to qc_conn_init(). This must be backported up to 2.7.	2023-11-20 16:11:26 +01:00
Willy Tarreau	3e913909e7	BUILD: cache: fix build error on older compilers pre-c99 compilers will fail to build the cache since commit 48f81ec09 ("MAJOR: cache: Delay cache entry delete in reserve_hot function") due to an int declaration in the for loop. No backport is needed.	2023-11-20 11:43:52 +01:00
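As an illustration of the kind of change involved (not the actual cache code), a loop counter declared at the start of the enclosing block builds with pre-C99 compilers, whereas a declaration inside the for statement requires C99 or later. The function below is a hypothetical example.

```c
#include <assert.h>

/* Sums the integers from 0 to n-1. The counter is declared at the top of
 * the block; "for (int i = 0; ...)" would only build in C99 or later.
 */
static int sum_below(int n)
{
	int i;
	int sum = 0;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		sum += i;
	return sum;
}
```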
Willy Tarreau	445fc1fe3a	BUG/MINOR: sock: mark abns sockets as non-suspendable and always unbind them In 2.3, we started to get a cleaner socket unbinding mechanism with commit f58b8db47 ("MEDIUM: receivers: add an rx_unbind() method in the protocols"). This mechanism rightfully refrains from unbinding when sockets are expected to be transferable to another worker via "expose-fd listeners", but this is not compatible with ABNS sockets, which do not support reuseport, unbinding nor being renamed: in short they will always prevent a new process from binding. It turns out that this is not very visible because by pure accident, GTUNE_SOCKET_TRANSFER is only set in the code dealing with master mode and daemons, so it's never set in foreground mode nor in tests even if present on the stats socket. However with master mode, it is now always set even when not present on the stats socket, and will always conflict. The only reasonable approach seems to consist in marking these abns sockets as non-suspendable so that the generic sock_unbind() code can decide to just unbind them regardless of GTUNE_SOCKET_TRANSFER. This should carefully be backported as far as 2.4.	2023-11-20 11:38:26 +01:00
William Lallemand	ef9a195742	BUG/MINOR: startup: set GTUNE_SOCKET_TRANSFER correctly This bug was forbidding the GTUNE_SOCKET_TRANSFER option to be set when haproxy is neither in daemon mode nor in mworker mode. So it basically only impacts the foreground mode. The fix moves the code outside the 'if (global.mode & (MODE_DAEMON | MODE_MWORKER | MODE_MWORKER_WAIT))' condition. Bug was introduced with 7f80eb23 ("MEDIUM: proxy: zombify proxies only when the expose-fd socket is bound"). Must be backported in every stable version.	2023-11-20 10:49:05 +01:00
Aurelien DARRAGON	82f4bcafae	MINOR: log/backend: prevent "dynamic-cookie-key" use with LOG mode It doesn't make sense to set "dynamic-cookie-key" inside a log backend, thus we report a warning to the user and reset the setting.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	c7783fb32b	MINOR: log/backend: prevent "http-send-name-header" use with LOG mode It doesn't make sense to use the "http-send-name-header" directive inside a log backend, so we report a warning in this case and reset the setting.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	4b2616f784	MINOR: log/backend: prevent stick table and stick rules with LOG mode Report a warning and prevent errors if user tries to declare a stick table or use stick rules within a log backend.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	5335618967	MINOR: log/backend: prevent tcp-{request,response} use with LOG mode We start implementing some postparsing compatibility checks for log backends. Here we report a warning if user tries to use tcp-{request,response} rules with log backend, and we properly ignore such rules when inherited from defaults section.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	6a29888f60	MINOR: log/backend: ensure log exclusive params are not used in other modes Add proxy_cfg_ensure_no_log() function (similar to proxy_cfg_ensure_no_http()) to ensure at the end of proxy parsing that no log exclusive options are found if the proxy is not in log mode.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	42d7d1bd47	Revert "MINOR: filter: "filter" requires TCP or HTTP mode" This reverts commit f9422551cd2b205332e4ea4e6195ed986e0e198a since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	c8948fb7ac	Revert "MINOR: flt_http_comp: "compression" requires TCP or HTTP mode" This reverts commit 225526dc16949ccbc83f59378d644eb6bda7681c since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	33e5c4055f	Revert "MINOR: http_htx/errors: prevent the use of some keywords when not in tcp/http mode" This reverts commit b41b77b4ccfd71647f469065006310772f4911a3 since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	0f9b475333	Revert "MINOR: fcgi-app: "use-fcgi-app" requires TCP or HTTP mode" This reverts commit 0ba731f50b4e6b75d32ddf8388fe32fad5cfadf3 since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	7d59730100	Revert "MINOR: cfgparse-listen: "http-reuse" requires TCP or HTTP mode" This reverts commit 65f1124b5dd4f01142337318f16968a33c2146ed since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	f1a072d077	Revert "MINOR: cfgparse-listen: "dynamic-cookie-key" requires TCP or HTTP mode" This reverts commit 0b09727a22fc2bd36bc7b3e73ca9c5a85ce2601c since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	a0a7dd1ee7	Revert "MINOR: cfgparse-listen: "http-send-name-header" requires TCP or HTTP mode" This reverts commit d354947365bbbafe3f6675fec0bea8617259842a since we cannot perform the test during parsing as the proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	c90d7dc46b	Revert "MINOR: stktable: "stick" requires TCP or HTTP mode" This reverts commit 098ae743fd17b3fae6671e53d9bdb74eb3f315fd since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	8e20fdbb1c	Revert "MINOR: tcp_rules: tcp-{request,response} requires TCP or HTTP mode" This reverts commit 09b15e4163b6a32a418b6f8b8e29dfb356d5fee6 since we cannot perform the test during parsing as the effective proxy mode is not yet known.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	b6e1e9ec8b	Revert "MINOR: proxy: report a warning for max_ka_queue in proxy_cfg_ensure_no_http()" This reverts commit 3934901 since it makes no sense to report a warning in this case given that max-keepalive-queue will also work with TCP backends.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	b61147fd2a	MEDIUM: log/balance: merge tcp/http algo with log ones "log-balance" directive was recently introduced to configure the balancing algorithm to use when in a log backend. However, it is confusing and it causes issues when used in a defaults section. In this patch, we take another approach: first we remove the "log-balance" directive, and instead we rely on the existing "balance" directive to configure log load balancing in a log backend. Some algorithms such as roundrobin can be used as-is in a log backend, and log-only algorithms are implemented as "log-$name" inside the "balance" directive. The documentation was updated accordingly.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	2c4943c18b	BUG/MINOR: proxy/stktable: missing frees on proxy cleanup In 1b8e68e ("MEDIUM: stick-table: Stop handling stick-tables as proxies.") we forgot to free the table pointer which is now dynamically allocated. Let's take this opportunity to also fix a missing free in the table itself (the table expire task wasn't properly destroyed). This patch depends on: - "MINOR: stktable: add stktable_deinit function" It should be backported to every stable version.	2023-11-18 11:16:21 +01:00
Aurelien DARRAGON	e10cf61099	MINOR: stktable: add stktable_deinit function Add the stktable_deinit() helper function to properly clean up a stick table that was initialized using stktable_init().	2023-11-18 11:16:21 +01:00
Willy Tarreau	6c7771f1b4	MINOR: stream/cli: add another filter "susp" to "show sess" This one reports streams considered as "suspicious", i.e. those with no expiration dates or dates in the past, or those without a front endpoint. More criteria could be added in the future.	2023-11-17 19:30:07 +01:00
Willy Tarreau	3ffcf7beb1	MINOR: stream/cli: add an optional "older" filter for "show sess" It's often needed to be able to refine "show sess" when debugging, and very often a first glance at old streams is performed, but that's a difficult task in large dumps, and it takes lots of resources to dump everything. This commit adds "older <age>" to "show sess" in order to specify the minimum age of streams that will be dumped. This should simplify the identification of blocked ones.	2023-11-17 19:30:04 +01:00
Willy Tarreau	ec76e0138b	BUG/MINOR: stream/cli: report correct stream age in "show sess" Since 2.4-dev2 with commit 15e525f49 ("MINOR: stream: Don't retrieve anymore timing info from the mux csinfo"), we don't replace the tv_accept (now accept_ts) anymore with the current request's, so that it properly reflects the session's accept date and not the request's date. However, since then we failed to update "show sess" to make use of the request's timestamp instead of the session's timestamp, resulting in bogus values in the "age" field of "show sess" for the task. Indeed, the session's age is displayed instead of the stream's, which leads to great confusion when debugging, particularly when it comes to multiplexed inter-proxy connections which are kept up forever. Let's fix this now. This must be backported as far as 2.4. However, for 2.7 and older, the field was named tv_request and was a timeval.	2023-11-17 18:59:12 +01:00
Willy Tarreau	662565ddb4	MINOR: backend: without ->connect(), allow to pick another thread's connection If fewer connections than threads are established on a reverse-http gateway and these servers have a non-null pool-min-conn, then conn_backend_get() will refrain from picking available connections from other threads. But this makes no sense for protocols for which there is no ->connect(), since there's no way the current thread will manage to establish its own connection. For such situations we should always accept to use another thread's connection. That's precisely what this patch does.	2023-11-17 18:13:04 +01:00
Willy Tarreau	f592a0d5dd	MINOR: rhttp: remove the unused outgoing connect() function A dummy connect() function previously had to be installed for the log server so that a reverse-http address could be referenced on a "server" line, but after the recent rework of the server line parsing, this is no longer needed, and this is actually annoying as it makes one believe there is a way to connect outside, which is not true. Let's now get rid of this function.	2023-11-17 18:10:16 +01:00
Willy Tarreau	d069825c5f	BUG/MEDIUM: mux-fcgi: fail earlier on malloc in takeover() This is the equivalent of the previous "BUG/MEDIUM: mux-h1: fail earlier on malloc in takeover()". Connection takeover was implemented for fcgi in 2.2 by commit a41bb0b6c ("MEDIUM: mux_fcgi: Implement the takeover() method."). It does have one corner case related to memory allocation failure: in case the task or tasklet allocation fails, the connection gets released synchronously. Unfortunately the situation is bad there, because the lower layers are already switched to the new thread while the tasklet is either NULL or still the old one, and calling fcgi_release() will also result in touching the thread-local list of buffer waiters and calling unsubscribe(). There are even code paths where the thread will try to grab the lock of its own idle conns list, believing the connection is there while it has no useful effect. However, if the owner thread was doing the same at the same moment, and ended up trying to pick from the current thread (which could happen if picking a connection for a different name), the two could even deadlock. No tests were made to try to reproduce the problem, but the description above is sufficient to see that nothing can guarantee against it. This patch takes a simple but radically different approach. Instead of starting to migrate the connection before risking allocation failures, it first pre-allocates a new task and tasklet, then assigns them to the connection if the migration succeeds, otherwise it just frees them. This way it's no longer needed to manipulate the connection until it's fully migrated, and as a bonus this means the connection will continue to exist and the use-after-free condition is solved at the same time. This should be backported to 2.2. Thanks to Fred for the initial analysis of the problem!	2023-11-17 18:10:16 +01:00
Willy Tarreau	95fd2d6801	BUG/MEDIUM: mux-h1: fail earlier on malloc in takeover() This is the h1 equivalent of previous "BUG/MEDIUM: mux-h2: fail earlier on malloc in takeover()". Connection takeover was implemented for H1 in 2.2 by commit f12ca9f8f1 ("MEDIUM: mux_h1: Implement the takeover() method."). It does have one corner case related to memory allocation failure: in case the task or tasklet allocation fails, the connection gets released synchronously. Unfortunately the situation is bad there, because the lower layers are already switched to the new thread while the tasklet is either NULL or still the old one, and calling h1_release() will call some unsubscribe and possibly other things whose safety is not guaranteed (and the ambiguity here alone is sufficient to be careful). There are even code paths where the thread will try to grab the lock of its own idle conns list, believing the connection is there while it has no useful effect. However, if the owner thread was doing the same at the same moment, and ended up trying to pick from the current thread (which could happen if picking a connection for a different name), the two could even deadlock. Contrary to mux-h2, a few tests were not sufficient to crash the process, but there's nothing that indicates it couldn't happen based on the description above. This patch takes a simple but radically different approach. Instead of starting to migrate the connection before risking allocation failures, it first pre-allocates a new task and tasklet, then assigns them to the connection if the migration succeeds, otherwise it just frees them. This way it's no longer needed to manipulate the connection until it's fully migrated, and as a bonus this means the connection will continue to exist and the use-after-free condition is solved at the same time. This should be backported to 2.2. Thanks to Fred for the initial analysis of the problem!	2023-11-17 18:10:16 +01:00
Willy Tarreau	4f02e3da67	BUG/MEDIUM: mux-h2: fail earlier on malloc in takeover() Connection takeover was implemented for H2 in 2.2 by commit cd4159f03 ("MEDIUM: mux_h2: Implement the takeover() method."). It does have one corner case related to memory allocation failure: in case the task or tasklet allocation fails, the connection gets released synchronously. Unfortunately the situation is bad there, because the lower layers are already switched to the new thread while the tasklet is either NULL or still the old one, and calling h2_release() will also result in h2_process() and h2_process_demux() that may process any possibly pending frames. Even the session remains the old one on the old thread, so that some sess_log() that are called when facing certain demux errors will be associated with the previous thread, possibly accessing a number of elements belonging to another thread. There are even code paths where the thread will try to grab the lock of its own idle conns list, believing the connection is there while it has no useful effect. However, if the owner thread was doing the same at the same moment, and ended up trying to pick from the current thread (which could happen if picking a connection for a different name), the two could even deadlock. The risk is extremely low, but Fred managed to reproduce use-after-free errors in conn_backend_get() after a takeover() failed by playing with -dMfail, indicating that h2_release() had been successfully called. In practice it's sufficient to have h2 on the server side with reuse-always and to inject lots of requests on it with -dMfail. This patch takes a simple but radically different approach. Instead of starting to migrate the connection before risking allocation failures, it first pre-allocates a new task and tasklet, then assigns them to the connection if the migration succeeds, otherwise it just frees them. 
This way it's no longer needed to manipulate the connection until it's fully migrated, and as a bonus this means the connection will continue to exist and the use-after-free condition is solved at the same time. This should be backported to 2.2. Thanks to Fred for the initial analysis of the problem!	2023-11-17 18:10:16 +01:00
Willy Tarreau	c7a90cc181	CLEANUP: haproxy: remove old comment from 1.1 from the file header There was still a totally outdated comment speaking about issues affecting solaris on 1.1.8pre4 (April 2002, 21 year-old)! This proves that comments in headers are never read, so let's take this opportunity for also removing the outdated one recommending to read the "updated" RFC7230.	2023-11-17 18:10:16 +01:00
Frédéric Lécaille	888d1dc3dc	MINOR: quic: Rename "handshake" timeout to "client-hs" Use a more specific name for this timeout to distinguish it from a possible future one on the server side. Also update the documentation.	2023-11-17 18:09:41 +01:00
Frédéric Lécaille	373e40f0c1	MEDIUM: session: handshake timeout (TCP) Adapt session_accept_fd() called on accept() to set the handshake timeout from the "handshake-timeout" setting if set by configuration. If not set, continue to use the "client" timeout setting.	2023-11-17 17:31:42 +01:00
Frédéric Lécaille	392640a61b	BUG/MINOR: quic: Malformed CONNECTION_CLOSE frame This bug arrived with this commit: "MINOR: quic: Avoid zeroing frame structures". Before that commit, the CONNECTION_CLOSE frame was zeroed, in particular the "reason phrase length". Reestablish this behavior. No need to backport.	2023-11-17 17:31:42 +01:00
Frédéric Lécaille	953c7dc2b9	MINOR: quic: Dump the expiration date of the idle timer task This date is shared between the idle timer and the handshake timeout. So, it should be useful to dump the expiration date of the idle timer task itself, instead of the idle timer expiration date. This way, the handshake timeout value will be visible during the handshake from the CLI "show quic full" command.	2023-11-17 17:31:42 +01:00
Frédéric Lécaille	e3e0bb90ce	MEDIUM: quic: Add support for "handshake" timeout setting. The idle timer task may be used to trigger the client handshake timeout. The handshake timeout expiration date (qc->hs_expire) is initialized when the connection is allocated. Obviously, this timeout is taken into account only during the handshake by qc_idle_timer_do_rearm() whose job is to rearm the idle timer. The idle timer expiration date could be initialized only one time, then never updated until the handshake completes. But this only works if the handshake timeout is smaller than the idle timer task timeout. If the handshake timeout is set greater than the idle timeout, the latter may expire before the handshake timeout. This patch may have an impact on the L1/C1 interop tests (with heavy packet loss or corruption). This is why I guess some implementations with handshake timeout support set a big timeout during this test. This is at least the case for ngtcp2 which sets a 180s handshake timeout! haproxy will certainly have to proceed the same way if it wants to have a chance to pass this test as before this handshake timeout.	2023-11-17 17:31:42 +01:00
Frédéric Lécaille	b33eacc523	MINOR: proxy: Add "handshake" new timeout (frontend side) Add a new timeout for the handshake, on the frontend side only. Such a timeout will typically be used for TLS handshakes during client connections to TLS/TCP or QUIC frontends.	2023-11-17 17:31:42 +01:00


17264 Commits