From 80ff10c81d7e4e752f59764bfda206f5043dd764 Mon Sep 17 00:00:00 2001
From: Willy Tarreau
Date: Thu, 5 Jan 2023 18:06:58 +0100
Subject: [PATCH] BUG/MINOR: fd: avoid bad tgid assertion in fd_delete() from deinit()

In 2.7, commit 0dc1cc93b ("MAJOR: fd: grab the tgid before manipulating
running") added a check to make sure we never try to delete an FD from
the wrong thread group. It already handles the specific case of an
isolated thread (e.g. stop a listener from the CLI) but forgot to take
into account the deinit() code iterating over all idle server connections
to close them.

This results in the crash below during deinit() if thread groups are
enabled and idle connections exist on a thread group higher than 1.

  [WARNING]  (15711) : Proxy decrypt stopped (cumulated conns: FE: 64, BE: 374511).
  [WARNING]  (15711) : Proxy stats stopped (cumulated conns: FE: 0, BE: 0).
  [WARNING]  (15711) : Proxy GLOBAL stopped (cumulated conns: FE: 0, BE: 0).

  FATAL: bug condition "fd_tgid(fd) != ti->tgid && !thread_isolated()" matched at src/fd.c:369
    call trace(11):
    |       0x4a6060 [c6 04 25 01 00 00 00 00]: main-0x1d60
    |       0x67fcc6 [c7 43 68 fd ad de fd 5b]: sock_conn_ctrl_close+0x16/0x1f
    |       0x59e6f5 [48 89 ef e8 83 65 11 00]: main+0xf6935
    |       0x60ad16 [48 8b 1b 48 81 fb a0 91]: free_proxy+0x716/0xb35
    |       0x62750e [48 85 db 74 35 48 89 dd]: deinit+0xbe/0x87a
    |       0x627ce2 [89 ef e8 97 76 e7 ff 0f]: deinit_and_exit+0x12/0x19
    |       0x4a9694 [bf e6 ff 9d 00 44 89 6c]: main+0x18d4/0x2c1a

There's no harm though since all traffic already ended.

This must be backported to 2.7.
---
 src/fd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/fd.c b/src/fd.c
index f4f1bae81..7f1077cd8 100644
--- a/src/fd.c
+++ b/src/fd.c
@@ -366,7 +366,7 @@ void fd_delete(int fd)
 	/* the tgid cannot change before a complete close so we should never
 	 * face the situation where we try to close an fd that was reassigned.
 	 */
-	BUG_ON(fd_tgid(fd) != ti->tgid && !thread_isolated());
+	BUG_ON(fd_tgid(fd) != ti->tgid && !thread_isolated() && !(global.mode & MODE_STOPPING));
 
 	/* we must postpone removal of an FD that may currently be in use
 	 * by another thread. This can happen in the following two situations: