Go to file
Willy Tarreau 821a04377d BUG/MEDIUM: muxes: enforce buf_wait check in takeover()
The ->takeover() is quite tricky. It didn't take care of the possibility
that the original thread's connection handler had been woken up to handle
an event (e.g. read0), failed to get a buffer, registered against its own
thread's buffer_wait queue and left the connection in an idle state.

A new thread could then come by, perform a takeover(), and when a buffer
was available, the new thread's tasklet would be woken up by the old one
via *_buf_available(), causing all sort of problems. These problems are
easy to reproduce, by running with shared backend connections and few
buffers (tune.buffers.limit=20, 8 threads, 500 connections, transfer
64kB objects and wait 2-5s for a crash to appear).

A first estimated solution consisted in removing the connection from the
idle list but it turns out that it would be worse for the delete stuff
(the connection no longer appearing as idle, making it impossible to find
it in order to close it). Also, idle counts wouldn't match anymore the
list's state, and the special case of private connections could be
difficult to handle as the connection could be forcefully re-added to the
idle list after allocation despite being private.

After multiple attempts to address the problem in various ways, it appears
that the only reliable solution for now (without starting to turn many
lists to mt_lists) is to have the takeover() function handle the buf_wait
detection or unregistration itself:

  - when doing a regular takeover aiming at finding an idle connection
    for a new request, connections that are blocked in a buffer_wait
    queue are quite rare and not interesting at all (since not immediately
    usable), so skipping them is sufficient. For this we detect that the
    desired connection belongs to a buffer_wait list by checking its
    buf_wait.list element. Note that this check is *not* thread-safe! The
    LIST_DEL_INIT() is performed by __offer_buffers() after the callback
    was called. But this is sufficient as it is now because the only way
    for the element to be seen as not in a list is after the element was
    last touched by __offer_buffers(), so the situation for this connection
    will not change in a different way later.

  - when doing a server delete, we're running under thread isolation.
    The connection might get taken over to be killed. The only trick is
    that private connections not belonging to any idle list may also
    experience this, and in this case even the idle_conns lock will not
    offer any protection against anything. But since we're run under
    thread isolation, we're certain not to compete with the other thread,
    so it's safe to directly unregister the connection from its owner
    thread. Normally this is already handled by conn_release() in
    cli_parse_delete_server(), which calls mux->destroy(), but this would
    actually update the current thread's queue instead of the origin
    thread's, thus we do need to perform an explicit dequeue before
    completing the takeover.

With this, the problem now looks solved for HTTP/1, HTTP/2 and FCGI,
though extensive tests were essentially run on HTTP/1 and HTTP/2.

While the problem has been there for a very long time, there should be
no reason to backport it since buffer_wait didn't practically work
before 3.0-dev and the process used to freeze hard very quickly before
we'd even have a chance to meet that race.
2024-05-15 19:37:12 +02:00
.github CI: drop asan.log umbrella completely 2024-05-13 11:36:36 +02:00
addons MINOR: stats: define stats-file output format support 2024-04-26 10:20:57 +02:00
admin BUILD: address a few remaining calloc(size, n) cases 2024-02-10 11:37:27 +01:00
dev DEV: flags/peers: Decode PEER and PEERS flags 2024-04-25 18:29:58 +02:00
doc MEDIUM: hlua: take nbthread into account in hlua_get_nb_instruction() 2024-05-15 11:59:44 +02:00
examples CLEANUP: assorted typo fixes in the code and comments 2023-11-23 16:23:14 +01:00
include MINOR: dynbuf: provide a b_dequeue() variant for multi-thread 2024-05-15 19:37:12 +02:00
reg-tests REGTESTS: ssl: be more verbose with ocsp_compat_check.vtc 2024-05-15 10:36:02 +02:00
scripts SCRIPTS: run-regtests: fix a few occurrences of extended regexes 2024-05-15 19:33:45 +02:00
src BUG/MEDIUM: muxes: enforce buf_wait check in takeover() 2024-05-15 19:37:12 +02:00
tests MINOR: ist: define iststrip() new function 2024-04-26 11:29:25 +02:00
.cirrus.yml CI: cirrus-ci: display gdb bt if any 2023-09-22 08:28:30 +02:00
.gitattributes MINOR: Configure the cpp userdiff driver for *.[ch] in .gitattributes 2021-02-22 18:17:57 +01:00
.gitignore CONTRIB: Add vi file extensions to .gitignore 2023-06-02 18:14:34 +02:00
.mailmap DOC: update Tim's address in .mailmap 2021-09-16 09:14:14 +02:00
.travis.yml CI: travis-ci: temporarily disable arm64 builds 2021-08-07 07:28:15 +02:00
BRANCHES DOC: fix some spelling issues over multiple files 2021-01-08 14:53:47 +01:00
BSDmakefile BUILD: makefile: commit the tiny FreeBSD makefile stub 2023-05-24 17:17:36 +02:00
CHANGELOG [RELEASE] Released version 3.0-dev11 2024-05-10 17:39:19 +02:00
CONTRIBUTING CLEANUP: assorted typo fixes in the code and comments 2021-08-16 12:37:59 +02:00
INSTALL DOC: install: clarify the build process by splitting it into subsections 2024-04-11 18:02:26 +02:00
LICENSE LICENSE: add licence exception for OpenSSL 2012-09-07 13:52:26 +02:00
MAINTAINERS MAJOR: spoe: Deprecate the SPOE filter 2024-03-15 11:29:39 +01:00
Makefile REORG: stats: define stats-proxy source module 2024-05-02 16:42:36 +02:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
SUBVERS BUILD: use format tags in VERDATE and SUBVERS files 2013-12-10 11:22:49 +01:00
VERDATE [RELEASE] Released version 3.0-dev11 2024-05-10 17:39:19 +02:00
VERSION [RELEASE] Released version 3.0-dev11 2024-05-10 17:39:19 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)