14970 Commits

Author SHA1 Message Date
Ilya Shipitsin
6f86eaae4f CLEANUP: assorted typo fixes in the code and comments
This is the 33rd iteration of typo fixes
2022-11-30 14:02:36 +01:00
Willy Tarreau
4ede46be4e BUG/MINOR: peers: always update the stksess shard number on incoming updates
If shards are in use, we must fill the shard number on incoming updates,
otherwise some entries are assigned shard number zero, and may be broadcast
everywhere once updated, instead of being sent only to the peers having the
same shard number.

This fixes commit 36d156564 ("MINOR: peers: Support for peer shards"). No
backport is needed.
2022-11-29 18:06:42 +01:00
Willy Tarreau
b12be7c1bb CLEANUP: peers: factor out the key len calculation in received updates
In peer_treat_updatemsg(), the lower layers of the stick-table code are
reimplemented, and the key length is never really known for an entry
being processed, it depends on the type being parsed and the moment
where it's done. This makes it quite difficult to stuff some shard
number calculation there.

This patch adds a keylen local variable that is always set to the length
of the current key depending on its type. It takes this opportunity for
reducing redundant expressions involving this length and always using the
new variable instead, limiting the risk of errors. Arguably that code
would have been way simpler by creating a dummy stktable_key and passing
it to stksess_new() as done anywhere else, but let's not change all that
a few days before the release.
2022-11-29 18:06:42 +01:00
Willy Tarreau
d5cae6a0c7 MINOR: stick-table: change the API of the function used to calculate the shard
The function used to calculate the shard number currently requires a
stktable_key on input for this. Unfortunately, it happens that peers
currently miss this calculation and they do not provide stktable_key
at all, instead they're open-coding all the low-level stick-table work
(hence why it's missing). Thus we'll need to be able to calculate the
shard number in keys coming from peers as well but the current API does
not make it possible.

This commit addresses this by inverting the order where the length and
the shard number are used. Now the low-level function is independent of
stksess and stktable_key, it takes a table, pointer and length and does
all the job. The upper function takes care of the type and key to get
its length, and is for use only from stick-table code.

This doesn't change anything except that the low-level one will be usable
from outside (hence why it's exported now).
2022-11-29 18:06:42 +01:00
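The layering described in d5cae6a0c7 can be sketched as below. This is only an illustration under assumptions: the struct, the function names, the FNV-1a stand-in hash and the "0 means no shard, otherwise 1..nb_shards" numbering are all hypothetical and are not taken from the HAProxy sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified table: only what the sketch needs. */
struct table {
	uint64_t hash_seed;   /* assumed per-table seed */
	unsigned nb_shards;   /* 0 means sharding is disabled (assumption) */
};

enum key_type { KEY_INT, KEY_STR };

/* Stand-in seeded hash (FNV-1a, 64-bit); not the hash HAProxy uses. */
uint64_t hash_bytes(const void *data, size_t len, uint64_t seed)
{
	const unsigned char *p = data;
	uint64_t h = 14695981039346656037ULL ^ seed;

	while (len--) {
		h ^= *p++;
		h *= 1099511628211ULL;
	}
	return h;
}

/* Low-level function: depends only on a table, a pointer and a length,
 * so callers that have no key object at hand can still use it.
 * Returns 0 when sharding is off, otherwise a value in 1..nb_shards.
 */
unsigned key_shard(const struct table *t, const void *key, size_t len)
{
	if (!t->nb_shards)
		return 0;
	return (unsigned)(hash_bytes(key, len, t->hash_seed) % t->nb_shards) + 1;
}

/* Upper function: derives the length from the key type, then delegates. */
unsigned typed_key_shard(const struct table *t, enum key_type type, const void *key)
{
	size_t len = (type == KEY_STR) ? strlen(key) : sizeof(uint32_t);

	return key_shard(t, key, len);
}
```

Keeping the type handling out of the low-level function is what lets code that parses raw keys call it directly with a pointer and a length.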
Christopher Faulet
061a098c5c BUG/MEDIUM: mux-h1: Close client H1C on EOS when there is no output data
If the client closes the connection while there is no pending outgoing data,
the H1 connection must be released. However, it was switched to CLOSING
state instead. Thus the client connection was closed on client timeout.

It is a side effect of commit d1b573059a ("MINOR: mux-h1: Avoid useless
call to h1_send() if no error is sent"). Before, the extra call to h1_send()
was able to fix the H1C state.

To fix the bug and make the switch to a close state (CLOSING or CLOSED) less
error-prone, the h1_close() helper function is systematically used.

It is a 2.7-specific bug. No backport needed.
2022-11-29 17:25:02 +01:00
Willy Tarreau
e548a7af45 BUG/MINOR: peers: always initialize the stksess shard value
We need to initialize the shard value in __stksess_init() because there is
not necessarily a key to make it happen later, resulting in an uninitialized
shard value appearing in the entry, typically when entries are learned from
peers. This fixes commit 36d156564 ("MINOR: peers: Support for peer shards"),
no backport is needed.

Note however that it is not sufficient to completely fix the peers code, the
shard value remains zero because the setting of the key was open-coded in
the peers code and these parts were not identified when adding support for
shards.
2022-11-29 16:33:37 +01:00
Willy Tarreau
f8c7709013 MINOR: mux-h2: add the expire task and its expiration date in "show fd"
Some issues such as #1929 seem to involve a task without timeout but we
can't find the condition to reproduce this in the code. However, not having
this info in the output doesn't help, so this patch adds the task pointer
and its timeout (when the task is non-null). It may be useful to backport
it.
2022-11-29 15:29:00 +01:00
Frédéric Lécaille
7b5d9b1f03 BUG/MINOR: quic: Endless loop during retransmissions
qc_dgrams_retransmit() could reuse the same local list and could splice it two
times to the packet number space list of frames to be sent/resent. This creates a
loop in this list and makes qc_build_frms() possibly endlessly loop when trying
to build frames from the packet number space list of frames. Then haproxy aborts.

This issue could be easily reproduced by patching the qc_build_frms() function to set
the <dlen> variable value to 0 after having built at least 10 CRYPTO frames and using
ngtcp2 as client with 30% packet loss in both directions.

Thank you to @gabrieltz for having reported this issue in GH #1903.

Must be backported to 2.6.
2022-11-29 15:19:16 +01:00
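As a rough illustration of why reusing a spliced list is dangerous (commit 7b5d9b1f03), here is a minimal circular doubly-linked list. The helper names are hypothetical and this is not the QUIC code; it only shows a splice that resets its source so that a second splice of the same local list becomes a no-op.

```c
#include <stddef.h>

/* Minimal circular doubly-linked list; names are hypothetical. */
struct list { struct list *n, *p; };

void list_init(struct list *l) { l->n = l->p = l; }
int list_is_empty(const struct list *l) { return l->n == l; }

/* Append <el> at the tail of <head>. */
void list_append(struct list *head, struct list *el)
{
	el->p = head->p;
	el->n = head;
	head->p->n = el;
	head->p = el;
}

/* Move every element of <src> to the tail of <dst>, then reset <src>.
 * Without the reset, <src> would still point to the moved elements and
 * splicing it a second time would relink them and corrupt the chain.
 */
void list_splice_tail_init(struct list *dst, struct list *src)
{
	if (list_is_empty(src))
		return;
	src->n->p = dst->p;
	dst->p->n = src->n;
	src->p->n = dst;
	dst->p = src->p;
	list_init(src);
}

/* Count the elements by walking forward until the head is met again. */
size_t list_count(const struct list *head)
{
	size_t n = 0;

	for (const struct list *it = head->n; it != head; it = it->n)
		n++;
	return n;
}
```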
Amaury Denoyelle
2526a6aca5 CLEANUP: ncbuf: use standard BUG_ON with DEBUG_STRICT
ncbuf can be compiled for haproxy or standalone to run the unit test suite.
For the latter mode, the BUG_ON() macro has been re-implemented in a simple
version.

The inclusion of the default or the redefined macro relied on DEBUG_DEV.
Change this to now rely on DEBUG_STRICT as this is activated for the
default build.

This change is safe as only BUG_ON_HOT() macro is used in ncbuf code,
which is activated only with the default value DEBUG_STRICT=2.

This should be backported up to 2.6.
2022-11-29 15:15:27 +01:00
Amaury Denoyelle
d64a26f023 CLEANUP: ncbuf: inline small functions
The ncbuf API relies on a lot of small functions. Mark these functions as
inline to reduce call invocations and facilitate compiler optimizations
to reduce code size.

This should be backported up to 2.6.
2022-11-29 15:14:39 +01:00
Amaury Denoyelle
17e20e8cef CLEANUP: ncbuf: remove ncb_blk args by value
ncb_blk structure is used to represent a block of data or a gap in a
non-contiguous buffer. This is used in several functions for ncbuf
implementation. Before this patch, ncb_blk was passed by value, which is
sub-optimal. Replace this by const pointer arguments.

This has the side effect of suppressing a compiler warning reported by
older GCC versions:
  CC      src/http_conv.o
  src/ncbuf.c: In function 'ncb_blk_next':
  src/ncbuf.c:170: warning: 'blk.end' may be used uninitialized in this function

This should be backported up to 2.6.
2022-11-29 15:12:54 +01:00
Willy Tarreau
16b282f4b0 MINOR: stick-table: show the shard number in each entry's "show table" output
Stick-tables support sharding to multiple peers but there was no way to
know to what shard an entry was going to be sent. Let's display this in
the "show table" output to ease debugging.
2022-11-29 12:00:49 +01:00
Willy Tarreau
56460ee52a MINOR: stick-table: store a per-table hash seed and use it
Instead of using memcpy() to concatenate the table's name to the key when
allocating an stksess, let's compute once for all a per-table seed at boot
time and use it to calculate the key's hash. This saves two memcpy() and
the usage of a chunk, it's always nice in a fast path.

When tested under extreme conditions with a 80-byte long table name, it
showed a 1% performance increase.
2022-11-28 18:58:06 +01:00
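The idea in 56460ee52a can be sketched as follows: the table name is folded into a seed once, and the fast path then hashes only the key bytes. The struct, the function names and the FNV-1a stand-in hash are assumptions made for this sketch and do not reflect the actual stick-table code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in seeded hash (FNV-1a, 64-bit); HAProxy's real hash differs. */
uint64_t seeded_hash(const void *data, size_t len, uint64_t seed)
{
	const unsigned char *p = data;
	uint64_t h = 14695981039346656037ULL ^ seed;

	while (len--) {
		h ^= *p++;
		h *= 1099511628211ULL;
	}
	return h;
}

/* Hypothetical table with a precomputed seed. */
struct table {
	const char *name;
	uint64_t hash_seed;
};

/* Boot time: derive the seed from the table name, once. */
void table_init_seed(struct table *t)
{
	t->hash_seed = seeded_hash(t->name, strlen(t->name), 0);
}

/* Fast path: hash only the key; no name concatenation, no scratch buffer. */
uint64_t table_key_hash(const struct table *t, const void *key, size_t len)
{
	return seeded_hash(key, len, t->hash_seed);
}
```

Two tables with different names still hash the same key differently, which is what the concatenation was providing, but the name is no longer copied on every lookup.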
Willy Tarreau
f9607f8b1f REORG: activity/cli: move the "show activity" handler to activity.c
Initially the code was placed into cli.c to keep activity.c small and
independent of the cli stuff, but that's no longer the case anyway and
keeping that code over there makes it harder to find. Let's move it to
its more natural place now.
2022-11-25 15:41:47 +01:00
Willy Tarreau
04b5b266e5 MINOR: activity: report uptime in "show activity"
It happened a few times that it was difficult to figure if a counter was
normal or not in "show activity" based on the uptime. Let's just emit the
uptime value along with the date.
2022-11-25 15:36:48 +01:00
William Lallemand
0a2d63236c BUG/MINOR: ssl: shut the ca-file errors emitted during httpclient init
With an OpenSSL library which uses the wrong OPENSSLDIR, HAProxy tries to
load the OPENSSLDIR/certs/ into @system-ca, but emits a warning when it
can't.

This patch fixes the issue by allowing the error to be silenced when the SSL
configuration for the httpclient is not explicit.

Must be backported in 2.6.
2022-11-24 19:14:19 +01:00
William Lallemand
3992f55ff3 MINOR: ssl: forgotten newline in error messages on ca-file
Add forgotten newlines in ssl_store_load_ca_from_buf() error messages.
2022-11-24 18:45:28 +01:00
Amaury Denoyelle
9875f024ba BUG/MEDIUM: quic: fix datagram dropping on queueing failed
After reading a datagram, it is enqueued for the thread attached to the
DCID. This is done via quic_lstnr_dgram_dispatch() function. If this
step fails, we remove the datagram from the buffer of quic_receiver_buf.

This step is faulty because we use b_del() instead of b_sub(). If
quic_receiver_buf was not empty, we will remove content from another
datagram while leaving the content of the last read datagram. This
probably causes valid datagrams to be dropped and may even result in a crash.

As stated, this bug can only happen if quic_lstnr_dgram_dispatch() fails,
which happens on two occasions:
* on quic_dgram allocation failure, which should be pretty rare
* on datagram labelled as invalid for QUIC protocol. This may happen
  more frequently depending on the network conditions. Thus, this bug
  has been labelled as a medium one.

This should be backported up to 2.6.
2022-11-24 16:45:02 +01:00
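The difference between removing from the head and removing from the tail (commit 9875f024ba) can be shown with a deliberately simplified buffer. The type and function names below are hypothetical stand-ins for the roles of b_del() and b_sub(); the sketch has no wrapping and no bounds checks, so callers must stay within the 64-byte area.

```c
#include <stddef.h>
#include <string.h>

/* Simplified linear buffer: <head> is the offset of the oldest byte,
 * <data> the number of bytes currently stored.
 */
struct buf {
	char area[64];
	size_t head;
	size_t data;
};

/* Append <len> bytes at the tail. */
void buf_put(struct buf *b, const char *src, size_t len)
{
	memcpy(b->area + b->head + b->data, src, len);
	b->data += len;
}

/* Drop <len> bytes from the head (oldest data), in the role of b_del(). */
void buf_del(struct buf *b, size_t len)
{
	b->head += len;
	b->data -= len;
}

/* Drop <len> bytes from the tail (newest data), in the role of b_sub(). */
void buf_sub(struct buf *b, size_t len)
{
	b->data -= len;
}

/* Pointer to the oldest byte still stored. */
const char *buf_head(const struct buf *b)
{
	return b->area + b->head;
}
```

With an older datagram "AAAA" queued and a rejected datagram "BB" just appended, dropping two bytes from the tail leaves "AAAA" intact, while dropping two bytes from the head leaves "AABB": part of the valid datagram is lost and the rejected one stays.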
Willy Tarreau
6c72fa3d18 CLEANUP: qpack: properly use the QPACK macros not HPACK ones in debug code
There were a few leftovers of DEBUG_HPACK and HPACK_SHT_SIZE instead of
their QPACK equivalent in the QPACK debug code. There's no harm anyway,
but it could lead to confusing results if the tables are not sized
equally.
2022-11-24 15:38:26 +01:00
Willy Tarreau
f87bb23acb CLEANUP: qpack: fix format string in debugging code (int signedness)
In issue #1939, Ilya mentions that cppcheck warned about use of "%d" to
report the QPACK table's index that's locally stored as an unsigned int.
While technically valid, this will never cause any trouble since indexes
are always small positive values, but better use %u anyway to silence
this warning.
2022-11-24 15:35:17 +01:00
Willy Tarreau
d05aa38950 CLEANUP: peers: fix format string for status messages (int signedness)
In issue #1939, Ilya mentions that cppcheck warned about use of "%d" to
report the status state that's locally stored as an unsigned int. While
technically valid, this will never cause any trouble since in the end
what we store there are the applet's states (just a few enum values).
Better use %u anyway to silence this warning.
2022-11-24 15:32:20 +01:00
Aurelien DARRAGON
a7dc251e07 MINOR: auth: silence null dereference warning in check_user()
In GH issue #1940 cppcheck warns about a possible null-dereference in
check_user() when DEBUG_AUTH is enabled. Indeed, <ep> may potentially
be NULL because upon error crypt_r() and crypt() may return a null
pointer. However it's not directly dereferenced, it is only passed to
printf() with '%s' fmt. While it is in practice fine with the printf
implementations we care about (that check strings against null before
printing them), it is undefined behavior according to the spec, hence
the warning.

Let's check <ep> before passing it to printf. This should partly
solve GH #1940.
2022-11-24 15:24:02 +01:00
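The pattern behind a7dc251e07 is simply to never hand a possibly-NULL pointer to a '%s' conversion. A minimal sketch follows; the helper names and the message format are hypothetical, not the actual check_user() debug output.

```c
#include <stdio.h>

/* Hypothetical helper: substitute a fallback for a NULL string. */
const char *str_or(const char *s, const char *fallback)
{
	return s ? s : fallback;
}

/* Format a debug line; <ep> may be NULL, e.g. when crypt() failed. */
int format_crypted(char *out, size_t size, const char *ep)
{
	return snprintf(out, size, "crypted=%s", str_or(ep, "(null)"));
}
```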
Willy Tarreau
95f40c698d MINOR: sample: make the rand() sample fetch function use the statistical_prng
Emeric noticed that producing many randoms to fill a stick table was
saturating on the rand_lock. Since 2.4 we have the statistical PRNG
for low-quality randoms like this one, there is no point in using the
one that was originally implemented for the purpose of creating safe
UUIDs, since the doc itself clearly states that these randoms are not
secure and they have not been in the past either. With this change,
locking contention is completely gone.
2022-11-24 15:04:13 +01:00
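Why a statistical PRNG removes the lock contention mentioned in 95f40c698d can be sketched with a per-thread generator. The code below is a generic xorshift32 step used as a stand-in, with hypothetical function names; it is not claimed to be HAProxy's statistical_prng implementation, and like it, it is not suitable for anything security-related.

```c
#include <stdint.h>

/* One state per thread, so no lock is needed. The state must be non-zero. */
static _Thread_local uint32_t prng_state = 2463534242u;

/* Reseed the calling thread's generator; zero is replaced by 1. */
void fast_rand_seed(uint32_t seed)
{
	prng_state = seed ? seed : 1;
}

/* One xorshift32 step: statistical quality only, not secure. */
uint32_t fast_rand(void)
{
	uint32_t x = prng_state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	prng_state = x;
	return x;
}
```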
Uriah Pollock
3cbf09ed64 MEDIUM: ssl: add minimal WolfSSL support with OpenSSL compatibility mode
This adds a USE_OPENSSL_WOLFSSL option, wolfSSL must be used with the
OpenSSL compatibility layer. This must be used with USE_OPENSSL=1.

WolfSSL build options:

   ./configure --prefix=/opt/wolfssl --enable-haproxy

HAProxy build options:

  USE_OPENSSL=1 USE_OPENSSL_WOLFSSL=1 WOLFSSL_INC=/opt/wolfssl/include/ WOLFSSL_LIB=/opt/wolfssl/lib/ ADDLIB='-Wl,-rpath=/opt/wolfssl/lib'

Using at least the commit 54466b6 ("Merge pull request #5810 from
Uriah-wolfSSL/haproxy-integration") from WolfSSL. (2022-11-23).

This is still to be improved, reg-tests are not supported yet, and more
tests are to be done.

Signed-off-by: William Lallemand <wlallemand@haproxy.org>
2022-11-24 11:29:03 +01:00
Willy Tarreau
33a6870fea BUILD: quic: silence two invalid build warnings at -O1 with gcc-6.5
Gcc 6.5 is now well known for triggering plenty of false "may be used
uninitialized", particularly at -O1, and two of them happen in quic,
quic_tp and quic_conn. Both of them were reviewed and easily confirmed
as wrong (gcc seems to ignore the control flow after the function
returns and believes error conditions are not met). Let's just preset
the variables that bother it. In quic_tp the initialization was moved
out of the loop since there's no point inflating the code just to
silence a stupid warning.
2022-11-24 09:16:41 +01:00
Willy Tarreau
08093cc0fa CLEANUP: tools: do not needlessly include xxhash nor cli from tools.h
These includes brought by commit 9c76637ff ("MINOR: anon: add new macros
and functions to anonymize contents") resulted in an increase of exactly
20% of the number of lines to build. These includes are not needed there,
only tools.c needs xxhash.h.
2022-11-24 08:30:48 +01:00
Willy Tarreau
07a3d539f5 BUILD: quic: global.h is needed in cfgparse-quic
cfgparse-quic accesses some members of the "global" struct but only
includes global-t.h. It actually used to work via tools.h due to a
long dependency chain that brought it, but it will be fixed and will
break cfgparse-quic, so let's fix it first.
2022-11-24 08:30:48 +01:00
Willy Tarreau
a4728584ff BUILD: stick-tables: fix build breakage in xxhash on older compilers
Commit 36d156564 ("MINOR: peers: Support for peer shards") reintroduced
a direct dependency on import/xxhash.h which was previously dropped by
commit d5fc8fcb8 ("CLEANUP: Add haproxy/xxhash.h to avoid modifying
import/xxhash.h"). This results in xxhash.h being included twice, which
breaks the build on older compilers which do not like seeing XXH32_hash_t
being defined twice.

Let's just use include/haproxy/xxhash.h instead.

No backport is needed.
2022-11-24 07:38:13 +01:00
Christopher Faulet
d1b573059a MINOR: mux-h1: Avoid useless call to h1_send() if no error is sent
If we choose to not send an error to the client, there is no reason to call
h1_send() for nothing. This happens because functions handling errors return
1 instead of 0 when nothing is performed.
2022-11-23 17:13:13 +01:00
Christopher Faulet
a1a76ce709 MINOR: mux-h1: Remove H1C_F_WAIT_NEXT_REQ in functions handling errors
It is cleaner to remove this flag in the internal functions handling errors,
responsible for updating the H1 connection state, instead of doing so in calling
functions. This will hopefully avoid bugs in the future.
2022-11-23 17:07:49 +01:00
Christopher Faulet
227424450c BUG/MINOR: mux-h1: Fix handling of 408-Request-Time-Out
When a timeout is detected waiting for the request, a 408-Request-Time-Out
response is sent. However, an error was introduced by commit 6858d76cd3
("BUG/MINOR: mux-h1: Obey dontlognull option for empty requests"). Instead
of inhibiting the log message, the option was stopping the error sending.

Of course, we must do the opposite.

This patch must be backported as far as 2.4.
2022-11-23 16:58:23 +01:00
Christopher Faulet
4427ea7f04 BUG/MEDIUM: mux-h1: Remove H1C_F_WAIT_NEXT_REQ flag on a next request
When an idle H1 connection starts to process a new request, we must take
care to remove H1C_F_WAIT_NEXT_REQ flag. This flag is used to know an idle
H1 connection has already processed at least one request and is waiting for
a next one, but nothing was received yet.

Keeping this flag leads to a crash because some running H1 connections may
be erroneously released on a soft-stop. Indeed, only idle or closed
connections must be released.

This bug was reported in #1903. It is 2.7-specific. No backport needed.
2022-11-23 15:59:00 +01:00
Christopher Faulet
881cce9f13 BUILD: ssl-sock: Silent error about NULL deref in ssl_sock_bind_verifycbk()
In ssl_sock_bind_verifycbk(), when compiled without QUIC support, the
compiler may report an error during compilation about a possible NULL
dereference:

src/ssl_sock.c: In function ‘ssl_sock_bind_verifycbk’:
src/ssl_sock.c:1738:12: error: potential null pointer dereference [-Werror=null-dereference]
 1738 |         ctx->xprt_st |= SSL_SOCK_ST_FL_VERIFY_DONE;
      |         ~~~^~~~~~~~~

A BUG_ON() was added because it must never happen. But when compiled
without DEBUG_STRICT, there is nothing to help the compiler. Thus
ALREADY_CHECKED() macro is used. The ssl-sock context and the bind config
are concerned.

This patch must be backported to 2.6.
2022-11-23 09:27:14 +01:00
Christopher Faulet
92c2de1a06 BUILD: http-htx: Silent build error about a possible NULL start-line
In http_replace_req_uri(), if the URI was successfully replaced, it means
the start-line exists. However, the compiler reports an error about a
possible NULL pointer dereference:

src/http_htx.c: In function ‘http_replace_req_uri’:
src/http_htx.c:392:19: error: potential null pointer dereference [-Werror=null-dereference]
  392 |         sl->flags &= ~HTX_SL_F_NORMALIZED_URI;

So, the ALREADY_CHECKED() macro is used to silence the build error. This patch
must be backported with 84cdbe478a ("BUG/MINOR: http-htx: Don't consider an
URI as normalized after a set-uri action").
2022-11-22 18:02:02 +01:00
Christopher Faulet
a462ee0af4 BUG/MEDIUM: mux-h1: Subscribe for reads on error on sending path
The recent refactoring of error handling in the H1 multiplexer
introduced a bug on abort when the client waits for the server response. The
bug only exists if the abortonclose option is not enabled. Indeed, in this case,
when the end of the message is reached, the mux stops receiving data
because these data are part of the next request. However, errors on the
sending path are no longer fatal. An error on the reading path must be
caught to do so. So, in case of a client abort, the error is not reported to
the upper layer and the H1 connection cannot be closed if some pending data
are blocked (remember, an error on the sending path was detected, blocking
outgoing data).

To be sure to have a chance to detect the abort in this case, when an error
is detected on the sending path, we force the subscription for reads.

This patch, with the previous one, should fix the issue #1943. It is
2.7-specific, no backport is needed.
2022-11-22 17:49:10 +01:00
Christopher Faulet
f75cc5468a BUG/MEDIUM: mux-h1: Don't release H1C on timeout if there is a SC attached
When the H1 task timed out, we must be careful to not release the H1
connection if there is still a H1 stream with a stream-connector
attached. In this case, we must wait. There are some tests to prevent this
from happening. But the last one only tests the UPGRADING state while there is
also the CLOSING state with a SC attached. But, in the end, it is safer to test
if there is a H1 stream with a SC attached.

This patch should partially fix the issue #1943. However, it only prevents
the segfault. There is another bug under the hood. BTW, this one is
2.7-specific. No backport is needed.
2022-11-22 17:49:10 +01:00
Christopher Faulet
84cdbe478a BUG/MINOR: http-htx: Don't consider an URI as normalized after a set-uri action
An absolute URI is marked as normalized if it comes from an H2 client. This
way, we know we can send a relative URI to an H1 server. But, after a
set-uri action, the URI must no longer be considered as normalized.
Otherwise there is no way to send an absolute URI on the server side.

If it is important to update a normalized absolute URI without altering this
property, the host, path and/or query-string must be set separately.

This patch should fix the issue #1938. It should be backported as far as
2.4.
2022-11-22 17:49:10 +01:00
Christopher Faulet
e16ffb0383 BUG/MINOR: h1: Replace authority validation to conform RFC3986
Except for CONNECT method, where a normalization is performed, we expected
to have an exact match between the authority and the host header value.
However it was too strict. Indeed, default port must be handled and the
matching must respect the RFC3986.

There is already a scheme-based normalization performed on the URI later,
on the HTX message. And we cannot normalize the URI during H1 parsing to be
able to report proper errors on the original raw buffer. And a systematic
read-only normalization to validate the authority will consume CPU for only
few requests. In the end, we decided to perform extra checks when the exact
match fails. Now, the following authority/host values are considered as equivalent:

  http:  domain.com <=> domain.com:80  <=> domain.com:
  https: domain.com <=> domain.com:443 <=> domain.com:

This patch depends on:

  * MINOR: h1: Consider empty port as invalid in authority for CONNECT
  * MINOR: http: Considere empty ports as valid default ports

It is a bug regarding the RFC3986. Technically, it may be backported as far
as 2.4. However, this must be discussed first. If it is backported, the
commits above must be backported too.
2022-11-22 17:49:10 +01:00
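The fallback comparison described in e16ffb0383 (exact match first, extra checks only when it fails) could look roughly like the sketch below. The function names are hypothetical, the comparison is byte-wise (case handling is left out on purpose), and this is not the code of the H1 parser.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of <s> once a trailing empty port (":") or the scheme's default
 * port has been removed. Other ports are kept as part of the name.
 */
size_t strip_default_port(const char *s, const char *default_port)
{
	size_t len = strlen(s);
	const char *colon = strrchr(s, ':');

	if (!colon)
		return len;
	if (colon[1] == '\0' || strcmp(colon + 1, default_port) == 0)
		return (size_t)(colon - s);
	return len;
}

/* Returns true when <authority> and <host> are equivalent for the scheme. */
bool authority_matches_host(const char *authority, const char *host, bool is_https)
{
	const char *dflt = is_https ? "443" : "80";
	size_t alen, hlen;

	if (strcmp(authority, host) == 0)  /* cheap exact match first */
		return true;

	/* slower path, only reached when the exact match failed */
	alen = strip_default_port(authority, dflt);
	hlen = strip_default_port(host, dflt);
	return alen == hlen && memcmp(authority, host, alen) == 0;
}
```

Doing the exact comparison first keeps the common case cheap, which matches the commit's concern about spending CPU on normalization for only a few requests.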
Christopher Faulet
e5dfe1169d BUG/MINOR: http-htx: Normalized absolute URIs with an empty port
Thanks to the previous commit ("MINOR: http: Considere empty ports as valid
default ports"), empty ports are now considered as valid default ports. Thus,
absolute URIs with empty port should be normalized.

So now, the following URIs are normalized:

 http://example.com:/  --> http://example.com/
 https://example.com:/ --> https://example.com/

This patch depends on:

   * MINOR: h1: Consider empty port as invalid in authority for CONNECT
   * MINOR: http: Considere empty ports as valid default ports

It is a bug regarding the RFC3986. Technically, it may be backported as far
as 2.4. However, this must be discussed first. If backported, the commits
above must be backported too.
2022-11-22 16:56:49 +01:00
Christopher Faulet
99ade9e0da MINOR: http: Considere empty ports as valid default ports
In RFC3986#6.2.3, the following URIs are considered as equivalent:

      http://example.com
      http://example.com/
      http://example.com:/
      http://example.com:80/

The third one is interesting because the port is empty and it is still
considered as a default port. Thus, http_get_host_port() no longer
returns IST_NULL when the port is empty. Now, an ist is returned, it points to
the first character after the colon (':') with a length of 0. In addition,
http_is_default_port() now considers an empty port as a default port,
regardless of the scheme.

This patch must not be backported or, if it is, not without the previous one
("MINOR: h1: Consider empty port as invalid in authority for CONNECT").
2022-11-22 16:56:49 +01:00
Christopher Faulet
75348c2e8b MINOR: h1: Consider empty port as invalid in authority for CONNECT
For now, this change is useless because http_get_host_port() returns
IST_NULL when the port is empty. But this will change. For other methods,
empty ports are valid. But not for CONNECT method. To still return a
400-Bad-Request if a CONNECT is performed with an empty port, istlen() is
used to test the port, instead of isttest().
2022-11-22 16:27:52 +01:00
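The distinction that 75348c2e8b relies on, a string that is present but empty versus a string that is absent, can be sketched with a minimal pointer-plus-length type. The struct and the helpers below are simplified stand-ins written for this example, not the real ist API or http_get_host_port().

```c
#include <stddef.h>
#include <string.h>

/* Minimal pointer+length string, in the spirit of an ist. */
struct ist {
	const char *ptr;
	size_t len;
};

/* Non-zero when the string exists at all (pointer is set). */
int ist_test(struct ist s) { return s.ptr != NULL; }

/* Number of bytes in the string. */
size_t ist_len(struct ist s) { return s.len; }

/* Hypothetical port extractor: returns what follows the last ':'.
 * No colon gives a NULL ist; an empty port gives a non-NULL ist of
 * length 0 pointing just after the colon.
 */
struct ist host_port(const char *host)
{
	struct ist port = { NULL, 0 };
	const char *colon = strrchr(host, ':');

	if (colon) {
		port.ptr = colon + 1;
		port.len = strlen(colon + 1);
	}
	return port;
}

/* CONNECT needs an explicit port: test the length, not just the pointer. */
int connect_authority_ok(const char *authority)
{
	return ist_len(host_port(authority)) > 0;
}
```

For "example.com:" the pointer test succeeds while the length is 0, so only the length test still rejects the CONNECT authority.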
Aurelien DARRAGON
e3177af465 CLEANUP: tools: extra check in utoa_pad
Removing useless check in utoa_pad().

This was reported by Ilya with the help of cppcheck.
2022-11-22 16:27:52 +01:00
Aurelien DARRAGON
ac1ca5cc7b CLEANUP: arg: remove extra check in make_arg_list arg escaping
Len cannot be equal to 1 when entering the escape handling code.
Yet, an extra "len == 1" check was performed.

Removing this useless check.

This was reported by Ilya with the help of cppcheck.
2022-11-22 16:27:52 +01:00
Aurelien DARRAGON
ab9efc25f0 BUG/MINOR: log: fix parse_log_message rfc5424 size check
In parse_log_message(), if log is rfc5424 compliant, p pointer
is incremented and size is not. However size is still used in further
checks as if p pointer was not incremented.

This could lead to logic error or buffer overflow if input buf is not
null-terminated.

Fixing this by making sure size is up to date where it is needed.

It could be backported up to 2.4.
2022-11-22 16:27:52 +01:00
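One way to avoid the class of bug fixed by ab9efc25f0 is to advance the pointer and the remaining size together through a single helper. The sketch below uses hypothetical names and only skips a "<digits> " version field such as the one that follows the priority in an RFC 5424 header; it is not parse_log_message() itself.

```c
#include <stddef.h>

/* A read position and the number of bytes still available from it. */
struct cursor {
	const char *p;
	size_t size;
};

/* Advance by <n> bytes, keeping pointer and remaining size in sync.
 * Returns 0 on success, -1 if fewer than <n> bytes remain.
 */
int cursor_skip(struct cursor *c, size_t n)
{
	if (n > c->size)
		return -1;
	c->p += n;
	c->size -= n;
	return 0;
}

/* Skip a leading "<digits> " field. Returns 1 if skipped, 0 otherwise.
 * Never reads past c->size, so the input need not be null-terminated.
 */
int skip_version(struct cursor *c)
{
	size_t i = 0;

	while (i < c->size && c->p[i] >= '0' && c->p[i] <= '9')
		i++;
	if (i == 0 || i >= c->size || c->p[i] != ' ')
		return 0;
	cursor_skip(c, i + 1);
	return 1;
}
```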
Aurelien DARRAGON
9dce88ba2c BUG/MINOR: cfgparse-listen: fix ebpt_next_dup pointer dereference on proxy "from" inheritance
ebpt_next_dup() was used 2 times in a row but only the first call was
checked against NULL, probably assuming that the 2 calls always yield the
same result here.

gcc is not OK with that, and it should be safer to store the result of
the first call in a temporary var to dereference it once checked against NULL.

This should fix GH #1869.
Thanks to Ilya for reporting this issue.

It may be backported up to 2.4.
2022-11-22 16:27:52 +01:00
Willy Tarreau
5ec79f1a04 BUILD: sched: fix build with DEBUG_THREAD with the previous commit
The build with DEBUG_THREAD was broken by commit fc50b9dd1 ("BUG/MAJOR:
sched: protect task during removal from wait queue"). It took me a while
to figure out how to declare an aligned and initialized rwlock that wasn't
static, but it turns out that __decl_aligned_rwlock() does exactly this,
so that we don't have to assign an integer value when a struct is expected
in case of debugging.

No backport is needed.
2022-11-22 10:24:07 +01:00
Willy Tarreau
fc50b9dd14 BUG/MAJOR: sched: protect task during removal from wait queue
The issue addressed by commit fbb934da9 ("BUG/MEDIUM: stick-table: fix
a race condition when updating the expiration task") is still present
when thread groups are enabled, but this time it lies in the scheduler.

What happens is that a task configured to run anywhere might already
have been queued into one group's wait queue. When updating a stick
table entry, sometimes the task will have to be dequeued and requeued.

For this a lock is taken on the current thread group's wait queue lock,
but while this is necessary for the queuing, it's not sufficient for
dequeuing since another thread might be in the process of expiring this
task under its own group's lock which is different. This is easy to test
using 3 stick tables with 1ms expiration, 3 track-sc rules and 4 thread
groups. The process crashes almost instantly under heavy traffic.

One approach could consist in storing the group number the task was
queued under in its descriptor (we don't need 32 bits to store the
thread id, it's possible to use one short for the tid and another
one for the tgrp). Sadly, no safe way to do this was found, because
the race remains at the moment the thread group number is checked, as
it might be in the process of being changed by another thread. It seems
that a working approach could consist in always having it associated
with one group, and only allowing to change it under this group's lock,
so that any code trying to change it would have to iteratively read it
and lock its group until the value matches, confirming it really holds
the correct lock. But this seems a bit complicated, particularly with
wait_expired_tasks() which already uses upgradable locks to switch from
read state to a write state.

Given that the shared tasks are not that common (stick-table expirations,
rate-limited listeners, maybe resolvers), it doesn't seem worth the extra
complexity for now. This patch takes a simpler and safer approach
consisting in switching back to a single wq_lock, but still keeping
separate wait queues. Given that shared wait queues are almost always
empty and that otherwise they're scanned under a read lock, the
contention remains manageable and most of the time the lock doesn't
even need to be taken since such tasks are not present in a group's
queue. In essence, this patch reverts half of the aforementioned
patch. This was tested and confirmed to work fine, without observing
any performance degradation under any workload. The performance with
8 groups on an EPYC 74F3 and 3 tables remains twice the one of a
single group, with the contention remaining on the table's lock first.

No backport is needed.
2022-11-22 09:10:08 +01:00
Willy Tarreau
469fa47950 BUILD: listener: fix build warning on global_listener_rwlock without threads
The global_listener_rwlock was introduced by recent commit 13e86d947
("BUG/MEDIUM: listener: Fix race condition when updating the global mngmt
task"), but it's declared static and is not used when threads are disabled,
thus causing a warning to be emitted in this case. Let's just condition it
to thread usage to shut the warning.

This will need to be backported where the patch above is backported.
2022-11-22 09:10:08 +01:00
Willy Tarreau
c21a187ec0 MINOR: server/idle: make the next_takeover index per-tgroup
In order to evenly pick idle connections from other threads, there is
a "next_takeover" index in the server, that is incremented each time
a connection is picked from another thread, and indicates which one to
start from next time.

With thread groups this doesn't work well because the index is the same
regardless of the group, and if a group has more threads than another,
there's even a risk of reintroducing an imbalance.

This patch introduces a new per-tgroup storage in servers which, for now,
only contains an instance of this next_takeover index. This way each
thread will now only manipulate the index specific to its own group, and
the takeover will become fair again. More entries may come soon.
2022-11-21 19:21:07 +01:00
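The per-group index from c21a187ec0 can be pictured as below. All names, the fixed MAX_TGROUPS bound and the thread numbering are assumptions for the sketch, and the update is intentionally not atomic; real multi-threaded code would need atomic operations here.

```c
#define MAX_TGROUPS 4

/* Hypothetical description of a thread group. */
struct tgroup_info {
	unsigned first_thread;  /* global id of the group's first thread */
	unsigned nb_threads;    /* number of threads in the group */
};

/* Hypothetical server with one round-robin index per thread group. */
struct server {
	struct {
		unsigned next_takeover;
	} per_tgrp[MAX_TGROUPS];
};

/* Return the global thread id to start scanning from for group <grp>,
 * and advance that group's index only.
 */
unsigned next_takeover_thread(struct server *srv, const struct tgroup_info *tg, unsigned grp)
{
	unsigned idx = srv->per_tgrp[grp].next_takeover;

	srv->per_tgrp[grp].next_takeover = (idx + 1) % tg[grp].nb_threads;
	return tg[grp].first_thread + idx;
}
```

Because each group wraps on its own thread count, a group of 3 threads and a group of 2 threads both cycle evenly over their own members.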
Willy Tarreau
9dc231a6b2 BUG/MINOR: server/idle: at least use atomic stores when updating max_used_conns
In 2.2, some idle conns usage metrics were added by commit cf612a045
("MINOR: servers: Add a counter for the number of currently used
connections."), which mentioned that the operation doesn't need to be
atomic since we're not seeking exact values. This is true but at least
we should use atomic stores to make sure not to cause invalid values
to appear on archs that wouldn't guarantee atomicity when writing an
int, such as writing two 16-bit words. This is pretty unlikely on our
targets but better keep the code safe against this.

This may be backported as far as 2.2.
2022-11-21 19:21:07 +01:00
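The reasoning in 9dc231a6b2 (an approximate maximum is acceptable, a torn value is not) can be illustrated with C11 atomics. HAProxy has its own atomic macros; the sketch below uses <stdatomic.h> and hypothetical names purely for illustration.

```c
#include <stdatomic.h>

/* Hypothetical counters for one server. */
struct server_stats {
	atomic_uint curr_used_conns;
	atomic_uint max_used_conns;
};

/* Account one more used connection and raise the high-water mark.
 * The max update is deliberately a plain load followed by a store (no
 * compare-and-swap): a concurrent update may be lost, but every store
 * is atomic, so readers never observe a half-written value.
 */
void conn_used(struct server_stats *s)
{
	unsigned cur = atomic_fetch_add_explicit(&s->curr_used_conns, 1,
	                                         memory_order_relaxed) + 1;

	if (cur > atomic_load_explicit(&s->max_used_conns, memory_order_relaxed))
		atomic_store_explicit(&s->max_used_conns, cur, memory_order_relaxed);
}
```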