MEDIUM: resolvers: make the process_resolvers() task single-threaded

This task is sometimes caught triggering the watchdog while waiting for
the infamous resolvers lock, or the scheduler's wait queue lock in
task_queue(). Both are caused by its multi-threaded capability. The
task may indeed start on a thread that's different from the one that
is currently receiving a response and that holds the resolvers lock,
and when being queued back, it needs to take the wait queue lock. Both
problems disappear when sticking it to a single thread. But for configs
running multiple resolvers sections, it would be suboptimal to run them
all on the same thread. In order to avoid this, we implement a counter
in resolvers_finalize_config() that rotates the thread for each
resolvers section.
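
The rotation described above can be sketched in isolation as follows.
This is only a minimal illustration, not the HAProxy code: NB_THREADS
stands in for global.nbthread, and next_operating_thread() and
assign_sections() are hypothetical helpers that exist only for this
example.

```c
#include <stddef.h>

/* Assumption: stand-in for HAProxy's global.nbthread. */
#define NB_THREADS 4

/* Hypothetical helper: returns the thread index to use for the next
 * resolvers section, then advances a static counter that wraps around
 * once every thread has been handed out.
 */
static int next_operating_thread(void)
{
	static int operating_thread = 0;
	int tid = operating_thread;

	operating_thread = (operating_thread + 1) % NB_THREADS;
	return tid;
}

/* Hypothetical helper: stores in out[i] the thread chosen for the
 * i-th resolvers section, in declaration order.
 */
static void assign_sections(int *out, size_t nb_sections)
{
	for (size_t i = 0; i < nb_sections; i++)
		out[i] = next_operating_thread();
}
```

With four threads, six sections declared in a row would land on threads
0, 1, 2, 3, 0 and 1, so each section's task stays on one thread while
the sections as a whole are spread across threads.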

This was sufficient to further improve the performance here, making the
CPU usage drop to about 7% (from 11% previously and 38% initially), with
no resolvers lock contention showing up anymore in perf top output.

The change was kept fairly minimal to permit a backport once enough
testing is conducted on it. It could address a significant part of
the trouble reported by Felipe in GH issue #3101.
Willy Tarreau 2025-09-10 16:51:14 +02:00
parent d624aceaef
commit 2ce5e0edcc

@@ -2665,6 +2665,7 @@ static int resolvers_finalize_config(void)
 	const struct protocol *proto;
 	struct resolvers *resolvers;
 	struct proxy *px;
+	static int operating_thread = 0;
 	int err_code = 0;
 	enter_resolver_code();
@@ -2703,12 +2704,17 @@ static int resolvers_finalize_config(void)
 			}
 		}
-		/* Create the task associated to the resolvers section */
-		if ((t = task_new_anywhere()) == NULL) {
+		/* Create the task associated to the resolvers section.
+		 * We try to bind each resolvers section to a different thread
+		 * in order to avoid expensive multi-threading tasks and make
+		 * sure that the same thread deals with DNS I/O and scheduling.
+		 */
+		if ((t = task_new_on(operating_thread)) == NULL) {
 			ha_alert("resolvers '%s' : out of memory.\n", resolvers->id);
 			err_code |= (ERR_ALERT|ERR_ABORT);
 			goto err;
 		}
+		operating_thread = (operating_thread + 1) % global.nbthread;
 		/* Update task's parameters */
 		t->process = process_resolvers;