mirror of
https://git.haproxy.org/git/haproxy.git/
synced 2025-09-22 06:11:32 +02:00
BUG/MINOR: sched: properly cover for a rare MT_LIST_ADDQ() race
In commit 3ef7a190b ("MEDIUM: tasks: apply a fair CPU distribution between tasklet classes") we compute a total weight to be used to split the CPU time between queues. There is a mention that the total cannot be null, wihch is based on the fact that we only get there if thread_has_task() returns non-zero. But there is a very small race which can break this assumption: if two threads conflict on MT_LIST_ADDQ() on an empty shared list and both roll back before trying again, there is the possibility that a first call to MT_LIST_ISEMPTY() sees the first thread install itself, then the second call will see the list empty when both roll back. Thus we could proceed with the queue while it's temporarily empty and compute max lengths using a divide by zero. This case is very hard to trigger, it seldom happens on 16 threads at 400k req/s. Let's simply test for max_total and leave the loop when we've not found any work. No backport is needed, that's 2.2-only.
This commit is contained in:
parent
9467f18d32
commit
1553b6657d
12
src/task.c
12
src/task.c
@ -514,11 +514,17 @@ void process_runnable_tasks()
|
|||||||
max[TL_BULK] = default_weights[TL_BULK];
|
max[TL_BULK] = default_weights[TL_BULK];
|
||||||
|
|
||||||
/* Now compute a fair share of the weights. Total may slightly exceed
|
/* Now compute a fair share of the weights. Total may slightly exceed
|
||||||
* 100% due to rounding, this is not a problem. Note that by design
|
* 100% due to rounding, this is not a problem. Note that while in
|
||||||
* the sum cannot be NULL as we cannot get there without tasklets to
|
* theory the sum cannot be NULL as we cannot get there without tasklets
|
||||||
* process.
|
* to process, in practice it seldom happens when multiple writers
|
||||||
|
* conflict and rollback on MT_LIST_ADDQ(shared_tasklet_list), causing
|
||||||
|
* a first MT_LIST_ISEMPTY() to succeed for thread_has_task() and the
|
||||||
|
* one above to finally fail. This is extremely rare and not a problem.
|
||||||
*/
|
*/
|
||||||
max_total = max[TL_URGENT] + max[TL_NORMAL] + max[TL_BULK];
|
max_total = max[TL_URGENT] + max[TL_NORMAL] + max[TL_BULK];
|
||||||
|
if (!max_total)
|
||||||
|
return;
|
||||||
|
|
||||||
for (queue = 0; queue < TL_CLASSES; queue++)
|
for (queue = 0; queue < TL_CLASSES; queue++)
|
||||||
max[queue] = ((unsigned)max_processed * max[queue] + max_total - 1) / max_total;
|
max[queue] = ((unsigned)max_processed * max[queue] + max_total - 1) / max_total;
|
||||||
|
|
||||||
|
Loading…
x
Reference in New Issue
Block a user