Willy Tarreau fc50b9dd14 BUG/MAJOR: sched: protect task during removal from wait queue
The issue addressed by commit fbb934da9 ("BUG/MEDIUM: stick-table: fix
a race condition when updating the expiration task") is still present
when thread groups are enabled, but this time it lies in the scheduler.

What happens is that a task configured to run anywhere might already
have been queued into one group's wait queue. When updating a stick
table entry, sometimes the task will have to be dequeued and requeued.

For this the current thread group's wait queue lock is taken, but while
this is necessary for the queuing, it's not sufficient for dequeuing
since another thread might be in the process of expiring this task
under its own group's lock, which is different. This is easy to test
using 3 stick tables with 1ms expiration, 3 track-sc rules and 4 thread
groups. The process crashes almost instantly under heavy traffic.
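
To illustrate the pattern (with purely hypothetical names, not HAProxy's
actual structures): each group protects its own wait queue with its own
lock, so the updating thread and the expiring thread can end up touching
the same task node while holding two different locks:

    #include <pthread.h>

    struct task {
        struct task *next, *prev;      /* linkage in some group's queue */
    };

    /* one wait queue and one lock per thread group (illustrative layout) */
    struct tgroup {
        pthread_rwlock_t wq_lock;      /* protects this group's queue only */
        struct task     *wq_head;      /* this group's wait queue */
    };

    /* thread A: stick-table update requeuing a shared task while holding
     * only its own group's lock */
    void task_requeue(struct tgroup *my_grp, struct task *t)
    {
        pthread_rwlock_wrlock(&my_grp->wq_lock);
        /* BUG: t may currently be linked in another group's queue, whose
         * lock we do not hold; that group's expirer (below) may unlink
         * the same node concurrently and corrupt the queue */
        /* ... unlink t from wherever it is, relink it into my_grp ... */
        pthread_rwlock_unlock(&my_grp->wq_lock);
    }

    /* thread B: the equivalent of wait_expired_tasks(), scanning its own
     * queue under its own group's lock, which does not serialize with
     * the lock taken above when the two groups differ */
    void group_expire(struct tgroup *grp)
    {
        pthread_rwlock_wrlock(&grp->wq_lock);
        /* ... walk grp->wq_head and unlink expired tasks ... */
        pthread_rwlock_unlock(&grp->wq_lock);
    }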

One approach could consist in storing the group number the task was
queued under in its descriptor (we don't need 32 bits to store the
thread id, so one short could be used for the tid and another one for
the tgrp). Sadly, no safe way to do this could be found, because
the race remains at the moment the thread group number is checked, as
it might be in the process of being changed by another thread. It seems
that a working approach could consist in always keeping the task
associated with one group, and only allowing that association to change
under this group's lock, so that any code trying to change it would
have to iteratively read the group number and lock that group until the
value matches, confirming it really holds the correct lock. But this
seems a bit complicated, particularly with
wait_expired_tasks() which already uses upgradable locks to switch from
read state to a write state.
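
As a purely illustrative sketch of that rejected scheme (none of these
names exist in HAProxy), the dequeuing side would have to loop until the
group number read from the task still matches once the corresponding
lock is held:

    #include <pthread.h>
    #include <stdatomic.h>

    #define MAX_TGROUPS 16

    struct tgroup {
        pthread_rwlock_t wq_lock;
    };

    struct task {
        _Atomic unsigned short wq_tgid;  /* group whose queue holds the task */
        unsigned short         wq_tid;   /* owner thread inside that group */
    };

    extern struct tgroup tgroups[MAX_TGROUPS];   /* defined elsewhere */

    /* Lock the group the task is really queued under. Only the holder of
     * that group's lock would be allowed to change t->wq_tgid, so a
     * stable re-read after locking proves the right lock is held. */
    struct tgroup *task_lock_owner_group(struct task *t)
    {
        for (;;) {
            unsigned short grp = atomic_load(&t->wq_tgid);

            pthread_rwlock_wrlock(&tgroups[grp].wq_lock);
            if (atomic_load(&t->wq_tgid) == grp)
                return &tgroups[grp];          /* confirmed owner */
            pthread_rwlock_unlock(&tgroups[grp].wq_lock);  /* moved, retry */
        }
    }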

Given that the shared tasks are not that common (stick-table expirations,
rate-limited listeners, maybe resolvers), it doesn't seem worth the extra
complexity for now. This patch takes a simpler and safer approach
consisting in switching back to a single wq_lock, but still keeping
separate wait queues. Given that shared wait queues are almost always
empty and that otherwise they're scanned under a read lock, the
contention remains manageable and most of the time the lock doesn't
even need to be taken since such tasks are not present in a group's
queue. In essence, this patch reverts half of the aforementioned
patch. This was tested and confirmed to work fine, with no performance
degradation observed under any workload. The performance with 8 groups
on an EPYC 74F3 and 3 tables remains twice that of a single group, with
the contention remaining primarily on the table's lock.
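
A rough sketch of the retained layout, again with illustrative names
only rather than HAProxy's actual API: per-group queues remain, but a
single lock covers them all, with an empty-queue fast path and a read
lock for scanning:

    #include <pthread.h>

    struct task {
        struct task *next, *prev;
    };

    struct tgroup {
        struct task *shared_wq;     /* this group's queue of shared tasks */
    };

    /* a single lock now covers every group's shared wait queue */
    static pthread_rwlock_t wq_lock = PTHREAD_RWLOCK_INITIALIZER;

    void shared_task_requeue(struct tgroup *grp, struct task *t)
    {
        pthread_rwlock_wrlock(&wq_lock);  /* same lock for every group */
        /* ... unlink t from its current queue, relink it into grp ... */
        pthread_rwlock_unlock(&wq_lock);
    }

    void shared_wq_expire(struct tgroup *grp)
    {
        /* fast path: shared queues are almost always empty, no lock needed */
        if (!grp->shared_wq)
            return;

        pthread_rwlock_rdlock(&wq_lock);  /* scan under a read lock */
        /* ... take the write lock only when a task must be unlinked ... */
        pthread_rwlock_unlock(&wq_lock);
    }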

No backport is needed.
2022-11-22 09:10:08 +01:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for:

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located in the doc/ directory:

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)