808 Commits

Author SHA1 Message Date
Willy Tarreau
fe47e8dfc5 MINOR: proxy: only check abortonclose through a dedicated function
In order to prepare for changing the way abortonclose works, let's
replace the direct flag check with a similarly named function
(proxy_abrt_close) which returns the on/off status of the directive
for the proxy. For now it simply reflects the flag's state.
2025-10-08 10:29:41 +02:00
Olivier Houchard
b01a00acb1 BUG/MEDIUM: connections: Only avoid creating a mux if we have one
In connect_server(), only avoid creating a mux when we're reusing a
connection, if that connection already has one. We can reuse a
connection with no mux, if we made a first attempt at connecting to the
server and it failed before we could create the mux (or during the mux
creation). The connection will then be reused when trying again.
This fixes a bug where a stream could stall if the first connection
attempt failed before the mux creation. It is easy to reproduce by
creating random memory allocation failure with -dmFail.
This was introduced by commit 4aaf0bfbced22d706af08725f977dcce9845d340,
and thus does not need any backport as long as that commit is not
backported.
2025-10-03 13:13:10 +02:00
Chris Staite
54f53bc875 MINOR: backend: srv_is_up converter
There is currently an srv_queue converter which is capable of taking the
output of a dynamic name and determining the queue length for a given
server.  In addition there is a sample fetcher for whether a server is
currently up.  This simply combines the two such that srv_is_up can be
used as a converter too.

Future work might extend this to other sample fetchers for servers, but
this is probably the most useful for acl routing.
2025-09-26 10:46:48 +02:00
Chris Staite
faba98c85f MINOR: backend: srv_queue helper
In preparation of providing further server converters, split the code
for finding the server from the sample out.

Additionally, update the documentation for srv_queue converter to note
security concerns.
2025-09-26 10:46:48 +02:00
Aurelien DARRAGON
5c299dee5a MEDIUM: stats: consider that shared stats pointers may be NULL
This patch looks huge, but it has a very simple goal: protect all
accessed to shared stats pointers (either read or writes), because
we know consider that these pointers may be NULL.

The reason behind this is despite all precautions taken to ensure the
pointers shouldn't be NULL when not expected, there are still corner
cases (ie: frontends stats used on a backend which no FE cap and vice
versa) where we could try to access a memory area which is not
allocated. Willy stumbled on such cases while playing with the rings
servers upon connection error, which eventually led to process crashes
(since 3.3 when shared stats were implemented)

Also, we may decide later that shared stats are optional and should
be disabled on the proxy to save memory and CPU, and this patch is
a step further towards that goal.

So in essence, this patch ensures shared stats pointers are always
initialized (including NULL), and adds necessary guards before shared
stats pointers are de-referenced. Since we already had some checks
for backends and listeners stats, and the pointer address retrieval
should stay in cpu cache, let's hope that this patch doesn't impact
stats performance much.
2025-09-18 16:49:51 +02:00
Willy Tarreau
2d6b5c7a60 MEDIUM: connection: reintegrate conn_hash_node into connection
Previously the conn_hash_node was placed outside the connection due
to the big size of the eb64_node that could have negatively impacted
frontend connections. But having it outside also means that one
extra allocation is needed for each backend connection, and that one
memory indirection is needed for each lookup.

With the compact trees, the tree node is smaller (16 bytes vs 40) so
the overhead is much lower. By integrating it into the connection,
We're also eliminating one pointer from the connection to the hash
node and one pointer from the hash node to the connection (in addition
to the extra object bookkeeping). This results in saving at least 24
bytes per total backend connection, and only inflates connections by
16 bytes (from 240 to 256), which is a reasonable compromise.

Tests on a 64-core EPYC show a 2.4% increase in the request rate
(from 2.08 to 2.13 Mrps).
2025-09-16 09:23:46 +02:00
Willy Tarreau
ceaf8c1220 MEDIUM: connection: move idle connection trees to ceb64
Idle connection trees currently require a 56-byte conn_hash_node per
connection, which can be reduced to 32 bytes by moving to ceb64. While
ceb64 is theoretically slower, in practice here we're essentially
dealing with trees that almost always contain a single key and many
duplicates. In this case, ceb64 insert and lookup functions become
faster than eb64 ones because all duplicates are a list accessed in
O(1) while it's a subtree for eb64. In tests it is impossible to tell
the difference between the two, so it's worth reducing the memory
usage.

This commit brings the following memory savings to conn_hash_node
(one per backend connection), and to srv_per_thread (one per thread
and per server):

     struct       before  after  delta
  conn_hash_nodea   56     32     -24
  srv_per_thread    96     72     -24

The delicate part is conn_delete_from_tree(), because we need to
know the tree root the connection is attached to. But thanks to
recent cleanups, it's now clear enough (i.e. idle/safe/avail vs
session are easy to distinguish).
2025-09-16 09:23:46 +02:00
Willy Tarreau
95b8adff67 MINOR: connection: pass the thread number to conn_delete_from_tree()
We'll soon need to choose the server's root based on the connection's
flags, and for this we'll need the thread it's attached to, which is
not always the current one. This patch simply passes the thread number
from all callers. They know it because they just set the idle_conns
lock on it prior to calling the function.
2025-09-16 09:23:46 +02:00
Willy Tarreau
3d18a0d4c2 CLEANUP: backend: factor the connection lookup loop
The connection lookup loop is made of two identical blocks, one looking
in the idle or safe lists and the other one looking into the safe list
only. The second one is skipped if a connection was found or if the request
looks for a safe one (since already done). Also the two are slightly
different due to leftovers from earlier versions in that the second one
checks for safe connections and not the first one, and the second one
sets is_safe which is not used later.

Let's just rationalize all this by placing them in a loop which checks
first from the idle conns and second from the safe ones, or skips the
first step if the request wants a safe connection. This reduces the
code and shortens the time spent under the lock.
2025-09-16 09:23:46 +02:00
Olivier Houchard
d4c51a4f57 MEDIUM: server: Make use of the stored ALPN stored in the server
Now that which ALPN gets negociated for a given server, use that to
decide if we can create the mux right away in connect_server(), and use
it in conn_install_mux_be().
That way, we may create the mux soon enough for early data to be sent,
before the handshake has been completed.
This commit depends on several previous commits, and it has not been
deemed important enough to backport.
2025-09-09 19:01:24 +02:00
Willy Tarreau
6a2b3269f9 CLEANUP: backend: clarify the cases where we want to use early data
The conditions to use early data on output are super tricky and
detected later, so that it's difficult to figure how this works. This
patch splits the condition in two parts, the one that can be performed
early that is based on config/client/etc. It is used to clear a variable
that allows early data to be used in case any condition is not satisfied.
It was purposely split into multiple independent and reviewable tests.

The second part remains where it was at the end, and is used to temporarily
clear the handshake flags to let the data layer use early data. This one
being tricky, a large comment explaining the principle was added.

The logic was not changed at all, only the code was made more readable.
2025-09-09 19:01:24 +02:00
Willy Tarreau
9b9d0720e1 CLEANUP: backend: simplify the complex ifdef related to 0RTT in connect_server()
Since 3.0 we have HAVE_SSL_0RTT precisely to avoid checking horribly
complicated and unmaintainable conditions to detect support for 0RTT.
Let's just drop the complex condition and use the macro instead.
2025-09-09 19:01:24 +02:00
Willy Tarreau
4aaf0bfbce CLEANUP: backend: invert the condition to start the mux in connect_server()
Instead of trying to switch from delayed start to instant start based
on a single condition, let's do the opposite and preset the condition
to instant start and detect what could cause it to be delayed, thus
falling back to the slow mode. The condition remains exactly the
inverted one and better matches the comment about ALPN being the only
cause of such a delay.
2025-09-09 19:01:24 +02:00
Willy Tarreau
7b4a7f92b5 CLEANUP: backend: clarify the role of the init_mux variable in connect_server()
The init_mux variable is currently used in a way that's not super easy
to grasp. It's set a bit too late and requires to know a lot of info at
once. Let's first rename it to "may_start_mux_now" to clarify its role,
as the purpose is not to *force* the mux to be initialized now but to
permit it to do it.
2025-09-09 19:01:24 +02:00
Christopher Faulet
52866349a1 OPTIM: backend: Don't set SNI for non-ssl connections
There is no reason to set the SNI for non-ssl connections. It is not really
an issue because ssl_sock_set_servername() function will do nothing. But
there is no reason to uselessly evaluate an expression.

No backport needed, because there is no bug.
2025-09-05 15:56:42 +02:00
Willy Tarreau
93cc18ac42 MAJOR: backend: switch the default balancing algo to "random"
For many years, an unset load balancing algorithm would use "roundrobin".
It was shown several times that "random" with at least 2 draws (the
default) generally provides better performance and fairness in that
it will automatically adapt to the server's load and capacity. This
was further described with numbers in this discussion:

  https://www.mail-archive.com/haproxy@formilux.org/msg46011.html
  https://github.com/orgs/haproxy/discussions/3042

BTW there were no objection and only support for the change.

The goal of this patch is to change the default algo when none is
specified, from "roundrobin" to "random". This way, users who don't
care and don't set the load balancing algorithm will benefit from a
better one in most cases, while those who have good reasons to prefer
roundrobin (for session affinity or for reproducible sequences like used
in regtests) can continue to specify it.

The vast majority of users should not notice a difference.
2025-09-04 08:30:35 +02:00
Amaury Denoyelle
21f7974e05 OPTIM: backend: set release on takeover for strict maxconn
When strict maxconn is enforced on a server, it may be necessary to kill
an idle connection to never exceed the limit. To be able to delete a
connection from any thread, takeover is first used to migrate it on the
current thread prior to its deletion.

As takeover is performed to delete a connection instead of reusing it,
<release> argument can be set to true. This removes unnecessary
allocations of resources prior to connection deletion. As such, this
patch is a small optimization for strict maxconn implementation.

Note that this patch depends on the previous one which removes any
assumption in takeover implementation that thread isolation is active if
<release> is true.
2025-08-28 16:11:32 +02:00
Amaury Denoyelle
ec1ab8d171 MINOR: session: remove redundant target argument from session_add_conn()
session_add_conn() uses three argument : connection and session
instances, plus a void pointer labelled as target. Typically, it
represents the server, but can also be a backend instance (for example
on dispatch).

In fact, this argument is redundant as <target> is already a member of
the connection. This commit simplifies session_add_conn() by removing
it. A BUG_ON() on target is extended to ensure it is never NULL.
2025-07-30 11:39:57 +02:00
Aurelien DARRAGON
c24de077bd OPTIM: stats: store fast sharded counters pointers at session and stream level
Following commit 75e480d10 ("MEDIUM: stats: avoid 1 indirection by storing
the shared stats directly in counters struct"), in order to minimize the
impact of the recent sharded counters work, we try to push things a bit
further in this patch by storing and using "fast" pointers at the session
and stream levels when available to avoid costly indirections and
systematic "tgid" resolution (which can not be cached by the CPU due to
its THREAD-local nature).

Indeed, we know that a session/stream is tied to a given CPU, thanks to
this we know that the tgid for a given session/stream will never change.

Given that, we are able to store sharded frontend and listener counters
pointer at the session level (namely sess->fe_tgcounters and
sess->li_tgcounters), and once the backend and the server are selected,
we are also able to store backend and server sharded counters
pointer at the stream level (namely s->be_tgcounters and s->sv_tgcounters)

Everywhere we rely on these counters and the stream or session context is
available, we use the fast pointers it instead of the indirect pointers
path to make the pointer resolution a bit faster.

This optimization proved to bring a few percents back, and together with
the previous 75e480d10 commit we now fixed the performance regression (we
are back to back with 3.2 stats performance)
2025-07-25 18:24:23 +02:00
Aurelien DARRAGON
75e480d107 MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct
Between 3.2 and 3.3-dev we noticed a noticeable performance regression
due to stats handling. After bisecting, Willy found out that recent
work to split stats computing accross multiple thread groups (stats
sharding) was responsible for that performance regression. We're looking
at roughly 20% performance loss.

More precisely, it is the added indirections, multiplied by the number
of statistics that are updated for each request, which in the end causes
a significant amount of time being spent resolving pointers.

We noticed that the fe_counters_shared and be_counters_shared structures
which are currently allocated in dedicated memory since a0dcab5c
("MAJOR: counters: add shared counters base infrastructure")
are no longer huge since 16eb0fab31 ("MAJOR: counters: dispatch counters
over thread groups") because they now essentially hold flags plus the
per-thread group id pointer mapping, not the counters themselves.

As such we decided to try merging fe_counters_shared and
be_counters_shared in their parent structures. The cost is slight memory
overhead for the parent structure, but it allows to get rid of one
pointer indirection. This patch alone yields visible performance gains
and almost restores 3.2 stats performance.

counters_fe_shared_get() was renamed to counters_fe_shared_prepare() and
now returns either failure or success instead of a pointer because we
don't need to retrieve a shared pointer anymore, the function takes care
of initializing existing pointer.
2025-07-25 16:46:10 +02:00
Willy Tarreau
6ad9285796 CLEANUP: server: rename server_find_by_name() to server_find()
This function doesn't just look at the name but also the ID when the
argument starts with a '#'. So the name is not correct and explains
why this function is not always used when the name only is needed,
and why the list-based findserver() is used instead. So let's just
call the function "server_find()", and rename its generation-id based
cousin "server_find_unique()".
2025-07-15 10:30:28 +02:00
Aurelien DARRAGON
4fcc9b5572 MINOR: counters: rename last_change counter to last_state_change
Since proxy and server struct already have an internal last_change
variable and we cannot merge it with the shared counter one, let's
rename the last_change counter to be more specific and prevent the
mixup between the two.

last_change counter is renamed to last_state_change, and unlike the
internal last_change, this one is a shared counter so it is expected
to be updated by other processes in our back.

However, when updating last_state_change counter, we use the value
of the server/proxy last_change as reference value.
2025-06-30 16:26:38 +02:00
Aurelien DARRAGON
5b1480c9d4 MEDIUM: proxy: add and use a separate last_change variable for internal use
Same motivation as previous commit, proxy last_change is "abused" because
it is used for 2 different purposes, one for stats, and the other one
for process-local internal use.

Let's add a separate proxy-only last_change variable for internal use,
and leave the last_change shared (and thread-grouped) counter for
statistics.
2025-06-30 16:26:31 +02:00
Amaury Denoyelle
a0db93f3d8 MEDIUM: backend: delay MUX init with ALPN even if proto is forced
On backend side, multiplexer layer is initialized during
connect_server(). However, this step is not performed if ALPN is used,
as the negotiated protocol may be unknown. Multiplexer initialization is
delayed after TLS handshake completion.

There are still exceptions though that forces the MUX to be initialized
even if ALPN is used. One of them was if <mux_proto> server field was
already set at this stage, which is the case when an explicit proto is
selected on the server line configuration. Remove this condition so that
now MUX init is delayed with ALPN even if proto is forced.

The scope of this change should be minimal. In fact, the only impact
concerns server config with both proto and ALPN set, which is pretty
unlikely as it is contradictory.

The main objective of this patch is to prepare QUIC support on the
backend side. Indeed, QUIC proto will be forced on the server if a QUIC
address is used, similarly to bind configuration. However, we still want
to delay MUX initialization after QUIC handshake completion. This is
mandatory to know the selected application protocol, required during
QUIC MUX init.
2025-06-12 11:21:32 +02:00
Frederic Lecaille
7c76252d8a MINOR: quic-be: Correct the QUIC protocol lookup
From connect_server(), QUIC protocol could not be retreived by protocol_lookup()
because of the PROTO_TYPE_STREAM default passed as argument. In place to support
QUIC srv->addr_type.proto_type may be safely passed.
2025-06-11 18:37:34 +02:00
Aurelien DARRAGON
16eb0fab31 MAJOR: counters: dispatch counters over thread groups
Most fe and be counters are good candidates for being shared between
processes. They are now grouped inside "shared" struct sub member under
be_counters and fe_counters.

Now they are properly identified, they would greatly benefit from being
shared over thread groups to reduce the cost of atomic operations when
updating them. For this, we take the current tgid into account so each
thread group only updates its own counters. For this to work, it is
mandatory that the "shared" member from {fe,be}_counters is initialized
AFTER global.nbtgroups is known, because each shared counter causes the stat
to be allocated lobal.nbtgroups times. When updating a counter without
concurrency, the first counter from the array may be updated.

To consult the shared counters (which requires aggregation of per-tgid
individual counters), some helper functions were added to counter.h to
ease code maintenance and avoid computing errors.
2025-06-05 09:59:38 +02:00
Aurelien DARRAGON
a0dcab5c45 MAJOR: counters: add shared counters base infrastructure
Shareable counters are not tagged as shared counters and are dynamically
allocated in separate memory area as a prerequisite for being stored
in shared memory area. For now, GUID and threads groups are not taken into
account, this is only a first step.

also we ensure all counters are now manipulated using atomic operations,
namely, "last_change" counter is now read from and written to using atomic
ops.

Despite the numerous changes caused by the counters being moved away from
counters struct, no change of behavior should be expected.
2025-06-05 09:58:58 +02:00
Willy Tarreau
099c1b2442 BUG/MAJOR: queue: properly keep count of the queue length
The queue length was moved to its own variable in commit 583303c48
("MINOR: proxies/servers: Calculate queueslength and use it."), however a
few places were missed in pendconn_unlink() and assign_server_and_queue()
resulting in never decreasing counts on aborted streams. This was
reproduced when injecting more connections than the total backend
could stand in TCP mode and letting some of them time out in the
queue. No backport is needed, this is only 3.2.
2025-05-17 10:46:10 +02:00
Olivier Houchard
b138eab302 BUG/MEDIUM: connections: Report connection closing in conn_create_mux()
Add an extra parametre to conn_create_mux(), "closed_connection".
If a pointer is provided, then let it know if the connection was closed.
Callers have no way to determine that otherwise, and we need to know
that, at least in ssl_sock_io_cb(), as if the connection was closed we
need to return NULL, as the tasklet was free'd, otherwise that can lead
to memory corruption and crashes.
This should be backported if 9240cd4a2771245fae4d0d69ef025104b14bfc23
is backported too.
2025-04-30 17:17:36 +02:00
Willy Tarreau
7b6df86a83 BUG/MINOR: backend: do not use the source port when hashing clientip
The server's "usesrc" keyword supports among other options "client"
and "clientip". The former means we bind to the client's IP and port
to connect to the server, while the latter means we bind to its IP
only. It's done in two steps, first alloc_bind_address() retrieves
the IP address and port, and second, tcp_connect_server() decides
to either bind to the IP only or IP+port.

The problem comes with idle connection pools, which hash all the
parameters: the hash is calculated before (and ideally withouy) calling
tcp_connect_server(), and it considers the whole struct sockaddr_storage
for the hash, except that both client and clientip entirely fill it with
the client's address. This means that both client and clientip make use
of the source port in the hash calculation, making idle connections
almost not reusable when using "usesrc clientip" while they should for
clients coming from the same source. A work-around is to force the
source port to zero using "tcp-request session set-src-port int(0)" but
it's ugly.

Let's fix this by properly zeroing the port for AF_INET/AF_INET6 addresses.

This can be backported to 2.4. Thanks to Sebastien Gross for providing a
reproducer for this problem.
2025-04-09 11:05:22 +02:00
Amaury Denoyelle
43367f94f1 MINOR: check/backend: support conn reuse with SNI
Support for connection reuse during server checks was implemented
recently. This is activated with the server keyword check-reuse-pool.

Similarly to stream processing via connect_backend(), a connection hash
is calculated when trying to perform reuse for checks. This is necessary
to retrieve for a connection which shares the check connect parameters.
However, idle connections can additionnally be tagged using a
pool-conn-name or SNI under connect_backend(). Check reuse does not test
these values, which prevent to retrieve a matching connection.

Improve this by using "check-sni" value as idle connection hash input
for check reuse. be_calculate_conn_hash() API has been adjusted so that
name value can be passed as input, both when using streams or checks.

Even with the current patch, there is still some scenarii which could
not be covered for checks connection reuse. most notably, when using
dynamic pool-conn-name/SNI value. It is however at least sufficient to
cover simpler cases.
2025-04-03 17:19:07 +02:00
Amaury Denoyelle
76e9156c9b MINOR: backend: mark srv as nonnull in alloc_dst_address()
Server instance can be NULL on connect_server(), either when dispatch or
transparent proxy are active. However, in alloc_dst_address() access to
<srv> is safe thanks to SF_ASSIGNED stream flag. Add an ASSUME_NONNULL()
to reflect this state.

This should fix coverity report from github issue #2922.
2025-04-03 17:19:07 +02:00
Willy Tarreau
870f7aa5cf BUILD: backend: silence a build warning when not using ssl
Since recent commit ee94a6cfc1 ("MINOR: backend: extract conn reuse
from connect_server()") a build warning "set but not used" on the
"reuse" variable is emitted, because indeed the variable is now only
checked when SSL is in use. Let's just mark it as such.
2025-04-02 15:26:31 +02:00
Amaury Denoyelle
f1fb396d71 MEDIUM: check: implement check-reuse-pool
Implement the possibility to reuse idle connections when performing
server checks. This is done thanks to the recently introduced functions
be_calculate_conn_hash() and be_reuse_connection().

One side effect of this change is that be_calculate_conn_hash() can now
be called with a NULL stream instance. As such, part of the functions
are adjusted accordingly.

Note that to simplify configuration, connection reuse is not performed
if any specific check connection parameters are defined on the server
line or via the tcp-check connect rule. This is performed via newly
defined tcpcheck_use_nondefault_connect().
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
20eb57b486 MINOR: backend: remove stream usage on connection reuse
Adjust newly defined be_reuse_connection() API. The stream argument is
removed. This will allows checks to be able to invoke it without relying
on a stream instance.
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
ee94a6cfc1 MINOR: backend: extract conn reuse from connect_server()
Following the previous patch, the part directly related to connection
reuse is extracted from connect_server(). It is now define in a new
function be_reuse_connection().
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
c7cc6b6401 MINOR: backend: extract conn hash calculation from connect_server()
On connection reuse, a hash is first calculated. It is generated from
various connection parameters, to retrieve a matching connection.

Extract hash calculation from connect_server() into a new dedicated
function be_calculate_conn_hash(). The objective is to be able to
perform connection reuse for checks, without connect_server() invokation
which relies on a stream instance.
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
4f0240f9a4 MINOR: backend: adjust conn_backend_get() API
The main objective of this patch is to remove the stream instance from
conn_backend_get() parameters. This would allow to perform reuse outside
of stream contexts, for example for checks purpose.
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
2ca616b4e1 MINOR: backend: fix comment when killing idle conns
Previously, if a server reached its pool-high-count limit, connection
were killed on connect_server() when reuse was not possible. However,
this is now performed even if reuse is done since the following patch :
  b3397367dc7cec9e78c62c54efc24d9db5cde2d2
  MEDIUM: connections: Kill connections even if we are reusing one.

Thus, adjust the related comment to reflect this state.
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
5fda64e87e BUG/MEDIUM: backend: fix reuse with set-dst/set-dst-port
On backend connection reuse, a hash is calculated from various
parameters, to ensure the selected connection match the requested
parameters. Notably, destination address is one of these parameters.
However, it is only taken into account if using a transparent server
(server address 0.0.0.0).

This may cause issue where an incorrect connection is reused, which is
not targetted to the correct destination address. This may be the case
if a set-dst/set-dst-port is used with a transparent proxy (proxy option
transparent).

The fix is simple enough. Destination address is now always used as
input to the connection reuse hash.

This must be backported up to 2.6. Note that for reverse HTTP to work,
it relies on the following patch, which ensures destination address
remains NULL in this case.

  commit e94baf6ca71cb2319610baa74dbf17b9bc602b18
  BUG/MINOR: rhttp: fix incorrect dst/dst_port values
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
d7fa8e88c4 BUG/MINOR: backend: do not overwrite srv dst address on reuse
Previously, destination address of backend connection was systematically
always reassigned. However, this step is unnecessary on connection
reuse. Indeed, reuse should only be conducted with connection using the
same destination address matching the stream requirements.

This patch removes this unnecessary assignment. It is now only performed
when reuse cannot be conducted and a new connection is instantiated.

Functionnally speaking, this patch should not change anything in theory,
as reuse is performed in conformance with the destination address.
However, it appears that it was not always properly enforced. The
systematic assignment of the destination address hides these issues, so
it is now remove. The identified bogus cases will then be fixed in the
following patches.would

This should be backported up to all stable versions.
2025-04-02 14:57:40 +02:00
Amaury Denoyelle
c05bb8c967 BUG/MINOR: rhttp: fix incorrect dst/dst_port values
With a @rhttp server, connect is not possible, transfer is only possible
via idle connection reuse. The server does not have any network address.

Thus, it is unnecessary to allocate the stream destination address prior
to connection reuse. This patch adjusts this by fixing
alloc_dst_address() to take this into account.

Prior to this patch, alloc_dst_address() would incorrectly assimilate a
@rhttp server with a transparent proxy mode. Thus stream destination
address would be copied from the destination address. Connection adress
would then be rewrote with this incorrect value. This did not impact
connect or reuse as destination addr is only used in idle conn hash
calculation for transparent servers. However, it causes incorrect values
for dst/dst_port samples.

This should be backported up to 2.9.
2025-04-02 14:57:40 +02:00
Ilia Shipitsin
78b849b839 CLEANUP: assorted typo fixes in the code and comments
code, comments and doc actually.
2025-04-02 11:12:20 +02:00
Willy Tarreau
78ef52dbd1 BUILD: backend: silence a build warning when threads are disabled
Since commit 8de8ed4f48 ("MEDIUM: connections: Allow taking over
connections from other tgroups.") we got this partially absurd
build warning when disabling threads:

  src/backend.c: In function 'conn_backend_get':
  src/backend.c:1371:27: warning: array subscript [0, 0] is outside array bounds of 'struct tgroup_info[1]' [-Warray-bounds]

The reason is that gcc sees that curtgid is not equal to tgid which is
defined as 1 in this case, thus it figures that tgroup_info[curtgid-1]
will be anything but zero and that doesn't fit. It is ridiculous as it
is a perfect case of dead code elimination which should not warrant a
warning. Nevertheless we know we don't need to do this when threads are
disabled and in this case there will not be more than 1 thread group, so
we can happily use that preliminary test to help the compiler eliminate
the dead condition and avoid spitting this warning.

No backport is needed.
2025-03-12 18:16:14 +01:00
Willy Tarreau
730641f7ca BUG/MINOR: server: check for either proxy-protocol v1 or v2 to send hedaer
As reported in issue #2882, using "no-send-proxy-v2" on a server line does
not properly disable the use of proxy-protocol if it was enabled in a
default-server directive in combination with other PP options. The reason
for this is that the sending of a proxy header is determined by a test on
srv->pp_opts without any distinction, so disabling PPv2 while leaving other
options results in a PPv1 header to be sent.

Let's fix this by explicitly testing for the presence of either send-proxy
or send-proxy-v2 when deciding to send a proxy header.

This can be backported to all versions. Thanks to Andre Sencioles (@asenci)
for reporting the issue and testing the fix.
2025-03-03 04:05:47 +01:00
Olivier Houchard
706b008429 MEDIUM: servers: Add strict-maxconn.
Maxconn is a bit of a misnomer when it comes to servers, as it doesn't
control the maximum number of connections we establish to a server, but
the maximum number of simultaneous requests. So add "strict-maxconn",
that will make it so we will never establish more connections than
maxconn.
It extends the meaning of the "restricted" setting of
tune.takeover-other-tg-connections, as it will also attempt to get idle
connections from other thread groups if strict-maxconn is set.
2025-02-26 13:00:18 +01:00
Olivier Houchard
8de8ed4f48 MEDIUM: connections: Allow taking over connections from other tgroups.
Allow haproxy to take over idle connections from other thread groups
than our own. To control that, add a new tunable,
tune.takeover-other-tg-connections. It can have 3 values, "none", where
we won't attempt to get connections from the other thread group (the
default), "restricted", where we only will try to get idle connections
from other thread groups when we're using reverse HTTP, and "full",
where we always try to get connections from other thread groups.
Unless there is a special need, it is advised to use "none" (or
restricted if we're using reverse HTTP) as using connections from other
thread groups may have a performance impact.
2025-02-26 13:00:18 +01:00
Willy Tarreau
a826250659 OPTIM: connection: don't try to kill other threads' connection when !shared
Users may have good reasons for using "tune.idle-pool.shared off", one of
them being the cost of moving cache lines between cores, or the kernel-
side locking associated with moving FDs. For this reason, when getting
close to the file descriptors limits, we must not try to kill adjacent
threads' FDs when the sharing of pools is disabled. This is extremely
expensive and kills the performance. We must limit ourselves to our local
FDs only. In such cases, it's up to the users to configure a large enough
maxconn for their usages.

Before this patch, perf top reported 9% CPU usage in connect_server()
onthe trylock used to kill connections when running at 4800 conns for
a global maxconn of 6400 on a 128-thread server. Now it doesn't spend
its time there anymore, and performance has increased by 12%. Note,
it was verified that disabling the locks in such a case has no effect
at all, so better keep them and stay safe.
2025-02-25 09:23:46 +01:00
Olivier Houchard
26b3e5236f MEDIUM: servers/proxies: Switch to using per-tgroup queues.
For both servers and proxies, use one connection queue per thread-group,
instead of only one. Having only one can lead to severe performance
issues on NUMA machines, it is actually trivial to get the watchdog to
trigger on an AMD machine, having a server with a maxconn of 96, and an
injector that uses 160 concurrent connections.
We now have one queue per thread-group, however when dequeueing, we're
dequeuing MAX_SELF_USE_QUEUE (currently 9) pendconns from our own queue,
before dequeueing one from another thread group, if available, to make
sure everybody is still running.
2025-01-28 12:49:41 +01:00
Olivier Houchard
583303c48b MINOR: proxies/servers: Calculate queueslength and use it.
For both proxies and servers, properly calculates queueslength, which is
the total number of element in each queues (as they currently are only
using one queue, it is equivalent to the number of element of that
queue), and use it instead of the queue's length.
2025-01-28 12:49:41 +01:00