7136 Commits

Author SHA1 Message Date
Willy Tarreau
c618ed5ff4 MAJOR: import: update mt_list to support exponential back-off
The new mt_list code supports exponential back-off on conflict, which
is important for use cases where there is contention on a large number
of threads. The API evolved a little bit and required some updates:

  - mt_list_for_each_entry_safe() is now in upper case to explicitly
    show that it is a macro, and only uses the back element, doesn't
    require a secondary pointer for deletes anymore.

  - MT_LIST_DELETE_SAFE() doesn't exist anymore, instead one just has
    to set the list iterator to NULL so that it is not re-inserted
    into the list and the list is spliced there. One must be careful
    because it was usually performed before freeing the element. Now
    instead the element must be nulled before the continue/break.

  - MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been
    unclear. They were replaced by mt_list_cut_around() and
    mt_list_connect_elem() which more explicitly detach the element
    and reconnect it into the list.

  - MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is
    in list.h. It may however possibly benefit from being upstreamed.

This required tiny adaptations to event_hdl.c and quic_sock.c. The
test case was updated and the API doc added. Note that in order to
keep include files small, the struct mt_list definition remains in
list-t.h (par of the internal API) and was ifdef'd out in mt_list.h.

A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with
80 threads shows a drastic reduction of CPU usage thanks to this and
the refined memory barriers. Please note that the CPU usage on OpenSSL
3.0.9 is significantly higher due to the excessive use of atomic ops
by openssl, but 3.1 is only slightly above 1.1.1 though:

  - before: 35 Gbps, 3.5 Mpps, 7800% CPU
  - after:  41 Gbps, 4.2 Mpps, 2900% CPU
2023-09-13 11:50:33 +02:00
Frédéric Lécaille
84757e32e6 BUG/MEDIUM: quic: quic_cc_conn ->cntrs counters unreachable
This bug arrived with this commit in 2.9-dev3:

    MEDIUM: quic: Allow the quic_conn memory to be asap released.

When sending packets from quic_cc_conn_io_cb(), e.g. when the quic_conn
object has been released and replaced by a lighter one (quic_cc_conn),
some counters may have to be incremented. But they were not reachable
because not shared between quic_conn and quic_cc_conn struct.

To fix this, one has only to move the ->cntrs counters from quic_conn
to QUIC_CONN_COMMON struct which is shared between both quic_cc_conn

Thank you to Tristan for having reported this in GH #2247.

No need to backport.
2023-09-12 18:13:36 +02:00
Willy Tarreau
efc46dede9 DEBUG: pools: inspect pools on fatal error and dump information found
It's a bit frustrating sometimes to see pool checks catch a bug but not
provide exploitable information without a core.

Here we're adding a function "pool_inspect_item()" which is called just
before aborting in pool_check_pattern() and POOL_DEBUG_CHECK_MARK() and
which will display the error type, the pool's pointer and name, and will
try to check if the item's tag matches the pool, and if not, will iterate
over all pools to see if one would be a better candidate, then will try
to figure the last known caller and possibly other likely candidates if
the pool's tag is not sufficiently trusted. This typically helps better
diagnose corruption in use-after-free scenarios, or freeing to a pool
that differs from the one the object was allocated from, and will also
indicate calling points that may help figure where an object was last
released or allocated. The info is printed on stderr just before the
backtrace.

For example, the recent off-by-one test in the PPv2 changes would have
produced the following output in vtest logs:

  ***  h1    debug|FATAL: pool inconsistency detected in thread 1: tag mismatch on free().
  ***  h1    debug|  caller: 0x62bb87 (conn_free+0x147/0x3c5)
  ***  h1    debug|  pool: 0x2211ec0 ('pp_tlv_256', size 304, real 320, users 1)
  ***  h1    debug|Tag does not match. Possible origin pool(s):
  ***  h1    debug|  tag: @0x2565530 = 0x2216740 (pp_tlv_128, size 176, real 192, users 1)
  ***  h1    debug|Recorded caller if pool 'pp_tlv_128':
  ***  h1    debug|  @0x2565538 (+0184) = 0x62c76d (conn_recv_proxy+0x4cd/0xa24)

A mismatch in the allocated/released pool is already visible, and the
callers confirm it once resolved, where the allocator indeed allocates
from pp_tlv_128 and conn_free() releases to pp_tlv_256:

  $ addr2line -spafe ./haproxy <<< $'0x62bb87\n0x62c76d'
  0x000000000062bb87: conn_free at connection.c:568
  0x000000000062c76d: conn_recv_proxy at connection.c:1177
2023-09-11 15:46:14 +02:00
Willy Tarreau
f6bee5a50b DEBUG: pools: make pool_check_pattern() take a pointer to the pool
This will be useful to report detailed bug traces.
2023-09-11 15:19:49 +02:00
Willy Tarreau
e92e96b00f DEBUG: pools: pass the caller pointer to the check functions and macros
In preparation for more detailed pool error reports, let's pass the
caller pointers to the check functions. This will be useful to produce
messages indicating where the issue happened.
2023-09-11 15:19:49 +02:00
Willy Tarreau
baf2070421 DEBUG: pools: always record the caller for uncached allocs as well
When recording the caller of a pool_alloc(), we currently store it only
when the object comes from the cache and never when it comes from the
heap. There's no valid reason for this except that the caller's pointer
was not passed to pool_alloc_nocache(), so it used to set NULL there.
Let's just pass it down the chain.
2023-09-11 15:19:49 +02:00
Willy Tarreau
4a18d9e560 REORG: cpuset: move parse_cpu_set() and parse_cpumap() to cpuset.c
These ones were still in cfgparse.c but they're not specific to the
config at all and may actually be used even when parsing cpu list
entries in /sys. Better move them where they can be reused.
2023-09-08 16:25:19 +02:00
Willy Tarreau
5119109e3f MINOR: cpuset: dynamically allocate cpu_map
cpu_map is 8.2kB/entry and there's one such entry per group, that's
~520kB total. In addition, the init code is still in haproxy.c enclosed
in ifdefs. Let's make this a dynamically allocated array in the cpuset
code and remove that init code.

Later we may even consider reallocating it once the number of threads
and groups is known, in order to shrink it a little bit, as the typical
setup with a single group will only need 8.2kB, thus saving half a MB
of RAM. This would require that the upper bound is placed in a variable
though.
2023-09-08 16:25:19 +02:00
Willy Tarreau
1f2433fb6a MINOR: tools: add function read_line_to_trash() to read a line of a file
This function takes on input a printf format for the file name, making
it particularly suitable for /proc or /sys entries which take a lot of
numbers. It also automatically trims the trailing CR and/or LF chars.
2023-09-08 16:25:19 +02:00
Frédéric Lécaille
e3e218b98e CLEANUP: quic: Remove useless free_quic_tx_pkts() function.
This function define but no more used since this commit:
    BUG/MAJOR: quic: Really ignore malformed ACK frames.
2023-09-08 10:17:25 +02:00
Frédéric Lécaille
292dfdd78d BUG/MINOR: quic: Wrong cluster secret initialization
The function generate_random_cluster_secret() which initializes the cluster secret
when not supplied by configuration is buggy. There 1/256 that the cluster secret
string is empty.

To fix this, one stores the cluster as a reduced size first 128 bits of its own
SHA1 (160 bits) digest, if defined by configuration. If this is not the case, it
is initialized with a 128 bits random value. Furthermore, thus the cluster secret
is always initialized.

As the cluster secret is always initialized, there are several tests which
are for now on useless. This patch removes such tests (if(global.cluster_secret))
in the QUIC code part and at parsing time: no need to check that a cluster
secret was initialized with "quic-force-retry" option.

Must be backported as far as 2.6.
2023-09-08 09:50:58 +02:00
William Lallemand
15e591b6e0 MINOR: ssl: add support for 'curves' keyword on server lines
This patch implements the 'curves' keyword on server lines as well as
the 'ssl-default-server-curves' keyword in the global section.

It also add the keyword on the server line in the ssl_curves reg-test.

These keywords allow the configuration of the curves list for a server.
2023-09-07 23:29:10 +02:00
Willy Tarreau
28ff1a5d56 MINOR: tasks/stats: report the number of niced tasks in "show info"
We currently know the number of tasks in the run queue that are niced,
and we don't expose it. It's too bad because it can give a hint about
what share of the load is relevant. For example if one runs a Lua
script that was purposely reniced, or if a stats page or the CLI is
hammered with slow operations, seeing them appear there can help
identify what part of the load is not caused by the traffic, and
improve monitoring systems or autoscalers.
2023-09-06 17:44:44 +02:00
Remi Tricot-Le Breton
e03d060aa3 MINOR: cache: Change hash function in default normalizer used in case of "vary"
When building the secondary signature for cache entries when vary is
enabled, the referer part of the signature was a simple crc32 of the
first referer header.
This patch changes it to a 64bits hash based of xxhash algorithm with a
random seed built during init. This will prevent "malicious" hash
collisions between entries of the cache.
2023-09-06 16:11:31 +02:00
Aurelien DARRAGON
d9b81e5b49 MEDIUM: log/sink: make logsrv postparsing more generic
We previously had postparsing logic but only for logsrv sinks, but now we
need to make this operation on logsrv directly instead of sinks to prepare
for additional postparsing logic that is not sink-specific.

To do this, we migrated post_sink_resolve() and sink_postresolve_logsrvs()
to their postresolve_logsrvs() and postresolve_logsrv_list() equivalents.

Then, we split postresolve_logsrv_list() so that the sink-only logic stays
in sink.c (sink_resolve_logsrv_buffer() function), and the "generic"
target part stays in log.c as resolve_logsrv().

Error messages formatting was preserved as far as possible but some slight
variations are to be expected.
As for the functional aspect, no change should be expected.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
969e212c66 MINOR: log: add dup_logsrv() helper function
ease code maintenance by introducing dup_logsrv() helper function to
properly duplicate an existing logsrv struct.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
d499485aa9 MINOR: sink: simplify post_sink_resolve function
Simplify post_sink_resolve() function to reduce code duplication and
make it easier to maintain.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
5b295ff409 MINOR: ring: add a function to compute max ring payload
Add a helper function to the ring API to compute the maximum payload
length that could fit into the ring based on ring size.
2023-09-06 16:06:39 +02:00
Christopher Faulet
3ec156f027 BUG/MEDIUM: applet: Fix API for function to push new data in channels buffer
All applets only check the -1 error value (need room) for applet_put*
functions while the underlying functions may also return -2 if the input is
closed or -3 if the data length is invalid. It means applets already handle
other cases by their own.

The API should be fixed but for now, to ease backports, we only fix
applet_put* functions to always return -1 on error. This way, at least for
the applets point of view, the API is consistent.

This patch should be backported to 2.8. Probably not further. Except if we
suspect it could fix a bug.
2023-09-06 09:29:27 +02:00
Frédéric Lécaille
fb4294be55 BUG/MINOR: quic: Wrong RTT computation (srtt and rrt_var)
Due to the fact that several variable values (rtt_var, srtt) were stored as multiple
of their real values, some calculations were less accurate as expected.

Stop storing 4*rtt_var values, and 8*srtt values.
Adjust all the impacted statements.

Must be backported as far as 2.6.
2023-09-05 17:14:51 +02:00
William Lallemand
d90d3bf894 MINOR: global: export the display_version() symbol
Export the display_version() function which can be used elsewhere than
in haproxy.c
2023-09-05 15:24:39 +02:00
Willy Tarreau
86854dd032 MEDIUM: threads: detect excessive thread counts vs cpu-map
This detects when there are more threads bound via cpu-map than CPUs
enabled in cpu-map, or when there are more total threads than the total
number of CPUs available at boot (for unbound threads) and configured
for bound threads. In this case, a warning is emitted to explain the
problems it will cause, and explaining how to address the situation.

Note that some configurations will not be detected as faulty because
the algorithmic complexity to resolve all arrangements grows in O(N!).
This means that having 3 threads on 2 CPUs and one thread on 2 CPUs
will not be detected as it's 4 threads for 4 CPUs. But at least configs
such as T0:(1,4) T1:(1,4) T2:(2,4) T3:(3,4) will not trigger a warning
since they're valid.
2023-09-04 19:39:17 +02:00
Willy Tarreau
8357f950cb MEDIUM: threads: detect incomplete CPU bindings
It's very easy to mess up with some cpu-map directives and to leave
some thread unbound. Let's add a test that checks that either all
threads are bound or none are bound, but that we do not face the
intermediary situation where some are pinned and others are left
wandering around, possibly on the same CPUs as bound ones.

Note that this should not be backported, or maybe turned into a
notice only, as it appears that it will easily catch invalid
configs and that may break updates for some users.
2023-09-04 19:39:17 +02:00
Willy Tarreau
e65f54cf96 MINOR: cpuset: centralize a reliable bound cpu detection
Till now the CPUs that were bound were only retrieved in
thread_cpus_enabled() in order to count the number of CPUs allowed,
and it relied on arch-specific code.

Let's slightly arrange this into ha_cpuset_detect_bound() that
reuses the ha_cpuset struct and the accompanying code. This makes
the code much clearer without having to carry along some arch-specific
stuff out of this area.

Note that the macos-specific code used in thread.c to only count
online CPUs but not retrieve a mask, so for now we can't infer
anything from it and can't implement it.

In addition and more importantly, this function is reliable in that
it will only return a value when the detection is accurate, and will
not return incomplete sets on operating systems where we don't have
an exact list, such as online CPUs.
2023-09-04 19:39:17 +02:00
Willy Tarreau
d3ecc67a01 MINOR: cpuset: add ha_cpuset_or() to bitwise-OR two CPU sets
This operation was not implemented and will be needed later.
2023-09-04 19:39:17 +02:00
Willy Tarreau
eb10567254 MINOR: cpuset: add ha_cpuset_isset() to check for the presence of a CPU in a set
This function will be convenient to test for the presence of a given CPU
in a set.
2023-09-04 19:39:17 +02:00
Willy Tarreau
17a7baca07 BUILD: bug: make BUG_ON() void to avoid a rare warning
When building without threads, the recently introduced BUG_ON(tid != 0)
turns to a constant expression that evaluates to 0 and that is not used,
resulting in this warning:

  src/connection.c: In function 'conn_free':
  src/connection.c:584:3: warning: statement with no effect [-Wunused-value]

This is because the whole thing is declared as an expression for clarity.
Make it return void to avoid this. No backport is needed.
2023-09-04 19:38:51 +02:00
Andrew Hopkins
b3f94f8b3b BUILD: ssl: Build with new cryptographic library AWS-LC
This adds a new option for the Makefile USE_OPENSSL_AWSLC, and
update the documentation with instructions to use HAProxy with
AWS-LC.

Update the type of the OCSP callback retrieved with
SSL_CTX_get_tlsext_status_cb with the actual type for
libcrypto versions greater than 1.0.2. This doesn't affect
OpenSSL which casts the callback to void* in SSL_CTX_ctrl.
2023-09-04 18:19:18 +02:00
Christopher Faulet
b50a471adb BUG/MEDIUM: stconn: Don't block sends if there is a pending shutdown
For the same reason than the previous patch, we must not block the sends
when there is a pending shutdown. In other words, we must consider the sends
are allowed when there is a pending shutdown.

This patch must slowly be backported as far as 2.2. It should partially fix
issue #2249.
2023-09-01 14:18:26 +02:00
Willy Tarreau
844a3bc25b MEDIUM: checks: implement a queue in order to limit concurrent checks
The progressive adoption of OpenSSL 3 and its abysmal handshake
performance has started to reveal situations where it simply isn't
possible anymore to succesfully run health checks on many servers,
because between the moment all the checks are started and the moment
the handshake finally completes, the timeout has expired!

This also has consequences on production traffic which gets
significantly delayed as well, all that for lots of checks. While it's
possible to increase the check delays, it doesn't solve everything as
checks still take a huge amount of time to converge in such conditions.

Here we take a different approach by permitting to enforce the maximum
concurrent checks per thread limitation and implementing an ordered
queue. Thanks to this, if a thread about to start a check has reached
its limit, it will add the check at the end of a queue and it will be
processed once another check is finished. This proves to be extremely
efficient, with all checks completing in a reasonable amount of time
and not being disturbed by the rest of the traffic from other checks.
They're just cycling slower, but at the speed the machine can handle.

One must understand however that if some complex checks perform multiple
exchanges, they will take a check slot for all the required duration.
This is why the limit is not enforced by default.

Tests on SSL show that a limit of 5-50 checks per thread on local
servers gives excellent results already, so that could be a good starting
point.
2023-09-01 14:00:04 +02:00
Willy Tarreau
cfc0bceeb5 MEDIUM: checks: search more aggressively for another thread on overload
When the current check is overloaded (more running checks than the
configured limit), we'll try more aggressively to find another thread.
Instead of just opportunistically looking for one half as loaded, now if
the current thread has more than 1% more active checks than another one,
or has more than a configured limit of concurrent running checks, it will
search for a more suitable thread among 3 other random ones in order to
migrate the check there. The number of migrations remains very low (~1%)
and the checks load very fair across all threads (~1% as well). The new
parameter is called tune.max-checks-per-thread.
2023-09-01 08:26:06 +02:00
Willy Tarreau
00de9e0804 MINOR: checks: maintain counters of active checks per thread
Let's keep two check counters per thread:
  - one for "active" checks, i.e. checks that are no more sleeping
    and are assigned to the thread. These include sleeping and
    running checks ;

  - one for "running" checks, i.e. those which are currently
    executing on the thread.

By doing so, we'll be able to spread the health checks load a bit better
and refrain from sending too many at once per thread. The counters are
atomic since a migration increments the target thread's active counter.
These numbers are reported in "show activity", which allows to check
per thread and globally how many checks are currently pending and running
on the system.

Ideally, we should only consider checks in the process of establishing
a connection since that's really the expensive part (particularly with
OpenSSL 3.0). But the inner layers are really not suitable to doing
this. However knowing the number of active checks is already a good
enough hint.
2023-09-01 08:26:06 +02:00
Willy Tarreau
3b7942a1c9 MINOR: check/activity: collect some per-thread check activity stats
We now count the number of times a check was started on each thread
and the number of times a check was adopted. This helps understand
better what is observed regarding checks.
2023-09-01 08:26:06 +02:00
Willy Tarreau
e03d05c6ce MINOR: check: remember when we migrate a check
The goal here is to explicitly mark that a check was migrated so that
we don't do it again. This will allow us to perform other actions on
the target thread while still knowing that we don't want to be migrated
again. The new READY bit combine with SLEEPING to form 4 possible states:

 SLP  RDY   State      Description
  0    0    -          (reserved)
  0    1    RUNNING    Check is bound to current thread and running
  1    0    SLEEPING   Check is sleeping, not bound to a thread
  1    1    MIGRATING  Check is migrating to another thread

Thus we set READY upon migration, and check for it before migrating, this
is sufficient to prevent a second migration. To make things a bit clearer,
the SLEEPING bit was switched with FASTINTER so that SLEEPING and READY are
adjacent.
2023-09-01 08:26:06 +02:00
Willy Tarreau
7163f95b43 MINOR: checks: start the checks in sleeping state
The CHK_ST_SLEEPING state was introduced by commit d114f4a68 ("MEDIUM:
checks: spread the checks load over random threads") to indicate that
a check was not currently bound to a thread and that it could easily
be migrated to any other thread. However it did not start the checks
in this state, meaning that they were not redispatchable on startup.

Sometimes under heavy load (e.g. when using SSL checks with OpenSSL 3.0)
the cost of setting up new connections is so high that some threads may
experience connection timeouts on startup. In this case it's better if
they can transfer their excess load to other idle threads. By just
marking the check as sleeping upon startup, we can do this and
significantly reduce the number of failed initial checks.
2023-09-01 08:26:06 +02:00
Willy Tarreau
52b260bae4 MINOR: server/ssl: maintain an index of the last known valid SSL session
When a thread creates a new session for a server, if none was known yet,
we assign the thread id (hence the reused_sess index) to a shared variable
so that other threads will later be able to find it when they don't have
one yet. For now we only set and clear the pointer upon session creation,
we do not yet pick it.

Note that we could have done it per thread-group, so as to avoid any
cross-thread exchanges, but it's anticipated that this is essentially
used during startup, at a moment where the cost of inter-thread contention
is very low compared to the ability to restart at full speed, which
explains why instead we store a single entry.
2023-08-31 08:50:01 +02:00
Willy Tarreau
607041dec3 MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
The goal will be to permit a thread to update its session while having
it shared with other threads. For now we only place the lock and arrange
the code around it so that this is quite light. For now only the owner
thread uses this lock so there is no contention.

Note that there is a subtlety in the openssl API regarding
i2s_SSL_SESSION() in that it fills the area pointed to by its argument
with a dump of the session and returns a size that's equal to the
previously allocated one. As such, it does modify the shared area even
if that's not obvious at first glance.
2023-08-31 08:50:01 +02:00
Alexander Stephan
ece0d1ab49 MINOR: sample: Refactor fc_pp_authority by wrapping the generic TLV fetch
We already have a call that can retreive an TLV with any value.
Therefore, the fetch logic is redundant and can be simplified
by simply calling the generic fetch with the correct TLV ID
set as an argument.
2023-08-29 15:31:51 +02:00
Alexander Stephan
fecc573da1 MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs
In order to be able to implement fetches in the future that allow
retrieval of any TLVs, a new generic data structure for TLVs is introduced.

Existing TLV fetches for PP2_TYPE_AUTHORITY and PP2_TYPE_UNIQUE_ID are
migrated to use this new data structure. TLV related pools are updated
to not rely on type, but only on size. Pools accomodate the TLV list
element with their associated value. For now, two pools for 128 B and
256 B values are introduced. More fine-grained solutions are possible
in the future, if necessary.
2023-08-29 15:15:47 +02:00
Alexander Stephan
c9d47652d2 CLEANUP/MINOR: connection: Improve consistency of PPv2 related constants
This patch improves readability by scoping HA proxy related PPv2 constants
with a 'HA" prefix. Besides, a new constant for the length of a CRC32C
TLV is introduced. The length is derived from the PPv2 spec, so 32 Bit.
2023-08-29 15:15:47 +02:00
Willy Tarreau
bd84387beb MEDIUM: capabilities: enable support for Linux capabilities
For a while there has been the constraint of having to run as root for
transparent proxying, and we're starting to see some cases where QUIC is
not running in socket-per-connection mode due to the missing capability
that would be needed to bind a privileged port. It's not realistic to
ask all QUIC users on port 443 to run as root, so instead let's provide
a basic support for capabilities at least on linux. The ones currently
supported are cap_net_raw, cap_net_admin and cap_net_bind_service. The
mechanism was made OS-specific with a dedicated file because it really
is. It can be easily refined later for other OSes if needed.

A new keyword "setcaps" is added to the global section, to enumerate the
capabilities that must be kept when switching from root to non-root. This
is ignored in other situations though. HAProxy has to be built with
USE_LINUX_CAP=1 for this to be supported, which is enabled by default
for linux-glibc, linux-glibc-legacy and linux-musl.

A good way to test this is to start haproxy with such a config:

    global
        uid 1000
        setcap cap_net_bind_service

    frontend test
        mode http
        timeout client 3s
        bind quic4@:443 ssl crt rsa+dh2048.pem allow-0rtt

and run it under "sudo strace -e trace=bind,setuid", then connecting
there from an H3 client. The bind() syscall must succeed despite the
user id having been switched.
2023-08-29 11:11:50 +02:00
William Lallemand
e7d9082315 BUG/MINOR: ssl/cli: can't find ".crt" files when replacing a certificate
Bug was introduced by commit 26654 ("MINOR: ssl: add "crt" in the
cert_exts array").

When looking for a .crt directly in the cert_exts array, the
ssl_sock_load_pem_into_ckch() function will be called with a argument
which does not have its ".crt" extensions anymore.

If "ssl-load-extra-del-ext" is used this is not a problem since we try
to add the ".crt" when doing the lookup in the tree.

However when using directly a ".crt" without this option it will failed
looking for the file in the tree.

The fix removes the "crt" entry from the array since it does not seem to
be really useful without a rework of all the lookups.

Should fix issue #2265

Must be backported as far as 2.6.
2023-08-28 18:20:39 +02:00
Willy Tarreau
892d04733f BUILD: import: guard plock.h against multiple inclusion
Surprisingly there's no include guard in plock.h though there is one in
atomic-ops.h. Let's add one, or we cannot risk including the file multiple
times.
2023-08-26 17:28:08 +02:00
Amaury Denoyelle
5afcb686b9 MAJOR: connection: purge idle conn by last usage
Backend idle connections are purged on a recurring occurence during the
process lifetime. An estimated number of needed connections is
calculated and the excess is removed periodically.

Before this patch, purge was done directly using the idle then the safe
connection tree of a server instance. This has a major drawback to take
no account of a specific ordre and it may removed functional connections
while leaving ones which will fail on the next reuse.

The problem can be worse when using criteria to differentiate idle
connections such as the SSL SNI. In this case, purge may remove
connections with a high rate of reusing while leaving connections with
criteria never matched once, thus reducing drastically the reuse rate.

To improve this, introduce an alternative storage for idle connection
used in parallel of the idle/safe trees. Now, each connection inserted
in one of this tree is also inserted in the new list at
`srv_per_thread.idle_conn_list`. This guarantees that recently used
connection is present at the end of the list.

During the purge, use this list instead of idle/safe trees. Remove first
connection in front of the list which were not reused recently. This
will ensure that connection that are frequently reused are not purged
and should increase the reuse rate, particularily if distinct idle
connection criterias are in used.
2023-08-25 15:57:48 +02:00
Amaury Denoyelle
61fc9568fb MINOR: server: move idle tree insert in a dedicated function
Define a new function _srv_add_idle(). This is a simple wrapper to
insert a connection in the server idle tree. This is reserved for simple
usage and require to idle_conns lock. In most cases,
srv_add_to_idle_list() should be used.

This patch does not have any functional change. However, it will help
with the next patch as idle connection will be always inserted in a list
as secondary storage along with idle/safe trees.
2023-08-25 15:57:48 +02:00
Amaury Denoyelle
77ac8eb4a6 MINOR: connection: simplify removal of idle conns from their trees
Small change of API for conn_delete_from_tree(). Now the connection
instance is taken as argument instead of its inner node.

No functional change introduced with this commit. This simplifies
slightly invocation of conn_delete_from_tree(). The most useful changes
is that this function will be extended in the next patch to be able to
remove the connection from its new idle list at the same time as in its
idle tree.
2023-08-25 15:57:48 +02:00
Frédéric Lécaille
81815a9a83 MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock.
Replace ->lock type of pat_ref struct by HA_RWLOCK_T.
Replace all calls to HA_SPIN_LOCK() (resp. HA_SPIN_UNLOCK()) by HA_RWLOCK_WRLOCK()
(resp. HA_RWLOCK_WRUNLOCK()) when a write access is required.
There is only one read access which is needed. This is in the "show map" command
callback, cli_io_handler_map_lookup() where a HA_SPIN_LOCK() call is replaced
by HA_RWLOCK_RDLOCK() (resp. HA_SPIN_UNLOCK() by HA_RWLOCK_RDUNLOCK).
Replace HA_SPIN_INIT() calls by HA_RWLOCK_INIT() calls.
2023-08-25 15:42:03 +02:00
Frédéric Lécaille
745d1a269b MEDIUM: map/acl: Improve pat_ref_set_elt() efficiency (for "set-map", "add-acl"action perfs)
Store a pointer to the expression (struct pattern_expr) into the data structure
used to chain/store the map element references (struct pat_ref_elt) , e.g. the
struct pattern_tree when stored into an ebtree or struct pattern_list when
chained to a list.

Modify pat_ref_set_elt() to stop inspecting all the expressions attached to a map
and to look for the <elt> element passed as parameter to retrieve the sample data
to be parsed. Indeed, thanks to the pointer added above to each pattern tree nodes
or list elements, they all can be inspected directly from the <elt> passed as
parameter and its ->tree_head and ->list_head member: the pattern tree nodes are
stored into elt->tree_head, and the pattern list elements are chained to
elt->list_head list. This inspection was also the job of pattern_find_smp() which
is no more useful. This patch removes the code of this function.
2023-08-25 15:41:59 +02:00
Frédéric Lécaille
0844bed7d3 MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)
Organize reference to pattern element of map (struct pat_ref_elt) into an ebtree:
  - add an eb_root member to the map (pat_ref struct) and an ebpt_node to its
    element (pat_ref_elt struct),
  - modify the code to insert these nodes into their ebtrees each time they are
    allocated. This is done in pat_ref_append().

Note that ->head member (struct list) of map (struct pat_ref) is not removed
could have been removed. This is not the case because still necessary to dump
the map contents from the CLI in the order the map elememnts have been inserted.

This patch also modifies http_action_set_map() which is the callback at least
used by "set-map" action. The pat_ref_elt element returned by pat_ref_find_elt()
is no more ignored, but reused if not NULL by pat_ref_set() as first element to
lookup from. This latter is also modified to use the ebtree attached to the map
in place of the ->head list attached to each map element (pat_ref_elt struct).

Also modify pat_ref_find_elt() to makes it use ->eb_root map ebtree added to the
map by this patch in place of inspecting all the elements with a strcmp() call.
2023-08-25 15:41:56 +02:00
Amaury Denoyelle
5053e89142 MEDIUM: h2: prevent stream opening before connection reverse completed
HTTP/2 demux must be handled with care for active reverse connection.
Until accept has been completed, it should be forbidden to handle
HEADERS frame as session is not yet ready to handle streams.

To implement this, use the flag H2_CF_DEM_TOOMANY which blocks demux
process. This flag is automatically set just after conn_reverse()
invocation. The flag is removed on rev_accept_conn() callback via a new
H2 ctl enum. H2 tasklet is woken up to restart demux process.

As a side-effect, reporting in H2 mux may be blocked as demux functions
are used to convert error status at the connection level with
CO_FL_ERROR. To ensure error is reported for a reverse connection, check
h2c_is_dead() specifically for this case in h2_wake(). This change also
has its own side-effect : h2c_is_dead() conditions have been adjusted to
always exclude !h2c->conn->owner condition which is always true for
reverse connection or else H2 mux may kill them unexpectedly.
2023-08-24 17:03:08 +02:00