Go to file
Willy Tarreau 77f286e8bc BUG/MEDIUM: stick-tables: make sure never to create two same remote entries
In GH issue #2552, Christian Ruppert reported an increase in crashes
with recent 3.0-dev versions, always related with stick-tables and peers.
One particularity of his config is that it has a lot of peers.

While trying to reproduce, it empirically was found that firing 10 load
generators at 10 different haproxy instances tracking a random key among
100k against a table of max 5k entries, on 8 threads and between a total
of 50 parallel peers managed to reproduce the crashes in seconds, very
often in ebtree deletion or insertion code, but not only.

The debugging revealed that the crashes are often caused by a parent node
being corrupted while delete/insert tries to update it regarding a recently
inserted/removed node, and that that corrupted node had always been proven
to be deleted, then immediately freed, so it ought not be visited in the
tree from functions enclosed between a pair of lock/unlock. As such the
only possibility was that it had experienced unexpected inserts. Also,
running with pool integrity checking would 90% of the time cause crashes
during allocation based on corrupted contents in the node, likely because
it was found at two places in the same tree and still present as a parent
of a node being deleted or inserted (hence the __stksess_free and
stktable_trash_oldest callers being visible on these items).

Indeed the issue is in fact related to the test set (occasionally redundant
keys, many peers). What happens is that sometimes, a same key is learned
from two different peers. When it is learned for the first time, we end up
in stktable_touch_with_exp() in the "else" branch, where the test for
existence is made before taking the lock (since commit cfeca3a3a3
("MEDIUM: stick-table: touch updates under an upgradable read lock") that
was merged in 2.9), and from there the entry is added. But is one of the
threads manages to insert it before the other thread takes the lock, then
the second thread will try to insert this node again. And inserting an
already inserted node will corrupt the tree (note that we never switched
to enforcing a check in insertion code on this due to API history that
would break various code parts).

Here the solution is simple, it requires to recheck leaf_p after getting
the lock, to avoid touching anything if the entry has already been
inserted in the mean time.

Many thanks to Christian Ruppert for testing this and for his invaluable
help on this hard-to-trigger issue.

This fix needs to be backported to 2.9.
2024-05-24 11:52:11 +02:00
.github CI: drop asan.log umbrella completely 2024-05-13 11:36:36 +02:00
addons MINOR: stats: define stats-file output format support 2024-04-26 10:20:57 +02:00
admin BUILD: address a few remaining calloc(size, n) cases 2024-02-10 11:37:27 +01:00
dev DEV: flags/peers: Decode PEER and PEERS flags 2024-04-25 18:29:58 +02:00
doc MINOR: config: add thread-hard-limit to set an upper bound to nbthread 2024-05-24 09:46:49 +02:00
examples CLEANUP: assorted typo fixes in the code and comments 2023-11-23 16:23:14 +01:00
include MINOR: config: add thread-hard-limit to set an upper bound to nbthread 2024-05-24 09:46:49 +02:00
reg-tests MAJOR: spoe: Let the SPOE back into the game 2024-05-22 09:04:38 +02:00
scripts CI: scripts/buil-ssl: cleanup the boringssl and quictls build 2024-05-23 16:54:30 +02:00
src BUG/MEDIUM: stick-tables: make sure never to create two same remote entries 2024-05-24 11:52:11 +02:00
tests MINOR: ist: define iststrip() new function 2024-04-26 11:29:25 +02:00
.cirrus.yml CI: cirrus-ci: display gdb bt if any 2023-09-22 08:28:30 +02:00
.gitattributes MINOR: Configure the cpp userdiff driver for *.[ch] in .gitattributes 2021-02-22 18:17:57 +01:00
.gitignore CONTRIB: Add vi file extensions to .gitignore 2023-06-02 18:14:34 +02:00
.mailmap DOC: update Tim's address in .mailmap 2021-09-16 09:14:14 +02:00
.travis.yml CI: travis-ci: temporarily disable arm64 builds 2021-08-07 07:28:15 +02:00
BRANCHES DOC: fix some spelling issues over multiple files 2021-01-08 14:53:47 +01:00
BSDmakefile BUILD: makefile: commit the tiny FreeBSD makefile stub 2023-05-24 17:17:36 +02:00
CHANGELOG [RELEASE] Released version 3.0-dev12 2024-05-18 16:51:23 +02:00
CONTRIBUTING CLEANUP: assorted typo fixes in the code and comments 2021-08-16 12:37:59 +02:00
INSTALL DOC: install: clarify the build process by splitting it into subsections 2024-04-11 18:02:26 +02:00
LICENSE LICENSE: add licence exception for OpenSSL 2012-09-07 13:52:26 +02:00
MAINTAINERS MAJOR: spoe: Let the SPOE back into the game 2024-05-22 09:04:38 +02:00
Makefile REORG: stats: define stats-proxy source module 2024-05-02 16:42:36 +02:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
SUBVERS BUILD: use format tags in VERDATE and SUBVERS files 2013-12-10 11:22:49 +01:00
VERDATE [RELEASE] Released version 3.0-dev12 2024-05-18 16:51:23 +02:00
VERSION [RELEASE] Released version 3.0-dev12 2024-05-18 16:51:23 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)