BUG/MAJOR: sched: protect task->expire on 32-bit platforms

Commit 7d40b3134 ("MEDIUM: sched: do not run a same task multiple times
in series") required to slightly reorder a few fields in struct tasklet
and task in order to reuse an existing hole and keep tree nodes aligned.

The problem is that nice+expire were placed in struct task just before
rq, and that a 48-bit hole replaces them in struct tasklet on 64-bit
platforms, just before the struct list. However, on 32-bit platforms,
the hole is only 16-bit and preserves nice, but expire is overwritten
by the first pointer of the list element. This is not a problem for
real tasklets which do not use these fields, but it definitely is a
problem for tasks that are cast to tasklets in the run queues, because
the expire field can be overwritten when the task is woken up, and if
requeued as-is, it will expire at a completely random date.

This is what caused certain regtests to fail on i386 and 32-bit arm
machines.

This fix needs to be backported wherever the patch above was backported.
The bug has no effect on 64-bit platforms. The fix doesn't inflate
structs on 64-bit, but will raise struct tasklet from 40 to 44 bytes on
32-bit platforms.

Thanks to William for spotting the problem, bisecting it and providing
a working reproducer.
This commit is contained in:
Willy Tarreau 2026-04-21 23:21:21 +02:00
parent 2f61566b03
commit c1c2a2bfe0

View File

@ -128,6 +128,12 @@ struct notification {
* pointer if the task/tasklet remains valid, and return NULL if it has been
* deleted. The scheduler relies on this to know if it should update its state
* on return.
*
* Keep in mind that tasks can be cast to tasklets while in the final tasklet
* queues, and will be listed via the tasklet's <list> instead of the task's
* <rq>. However other fields (wq, common) must remain totally valid for the
* task during this time. This explains why certain elements are present in
* the common part even though pure tasklets do not need them.
*/
#define TASK_COMMON \
unsigned int state; /* task state : bitfield of TASK_ */ \
@ -138,14 +144,14 @@ struct notification {
uint32_t wake_date; /* date of the last task wakeup */ \
unsigned int calls; /* number of times process was called */ \
TASK_DEBUG_STORAGE; \
short last_run; /* 16-bit now_ms of last run */
/* a 16- or 48-bit hole remains here and is used by task */
short last_run; /* 16-bit now_ms of last run */ \
short nice; /* task prio from -1024 to +1024 */ \
int expire; /* next expiration date for this task, in ticks */
/* total: 36 or 48 bytes on 32/64 bit platforms */
/* The base for all tasks */
struct task {
TASK_COMMON; /* must be at the beginning! */
short nice; /* task prio from -1024 to +1024 */
int expire; /* next expiration date for this task, in ticks */
struct eb32_node rq; /* ebtree node used to hold the task in the run queue */
/* WARNING: the struct task is often aliased as a struct tasklet when
* it is NOT in the run queue. The tasklet has its struct list here
@ -158,7 +164,6 @@ struct task {
/* lightweight tasks, without priority, mainly used for I/Os */
struct tasklet {
TASK_COMMON; /* must be at the beginning! */
/* 48-bit hole here */
struct list list;
/* WARNING: the struct task is often aliased as a struct tasklet when
* it is not in the run queue. The task has its struct rq here where