Commit Graph

24621 Commits

Author SHA1 Message Date
Ilia Shipitsin
61b30a09c0 CI: AWS-LC: enable unit tests
Run the new make unit-tests on the CI.
2025-05-14 17:00:31 +02:00
Ilia Shipitsin
944a96156e CI: AWS-LC(fips): enable unit tests
Run the new make unit-tests on the CI.
2025-05-14 17:00:31 +02:00
Willy Tarreau
e24b77e765 DOC: config: move the extraneous sections out of the "global" definition
Due to some historic mistakes that have spread to newly added sections,
a number of of recently added small sections found themselves described
under section 3 "global parameters" which is specific to "global" section
keywords. This is highly confusing, especially given that sections 3.1,
3.2, 3.3 and 3.10 directly start with keywords valid in the global section,
while others start with keywords that describe a new section.

Let's just create a new chapter "12. other sections" and move them all
there. 3.10 "HTTPclient tuning" however was moved to 3.4 as it's really
a definition of the global options assigned to the HTTP client. The
"programs" that are going away in 3.3 were moved at the end to avoid a
renumbering later.

Another nice benefit is that it moves a lot of text that was previously
keeping the global and proxies sections apart.
2025-05-14 16:08:02 +02:00
Willy Tarreau
da67a89f30 DOC: config: move stick-tables and peers to their own section
As suggested by Tim in issue #2953, stick-tables really deserve their own
section to explain the configuration. And peers have to move there as well
since they're totally dedicated to stick-tables.

Now we introduce a new section "Stick-tables and Peers", explaining the
concepts, and under which there is one subsection for stick-tables
configuration and one for the peers (which mostly keeps the existing
peers section).
2025-05-14 16:08:02 +02:00
Willy Tarreau
423dffa308 DOC: config: move address formats definition to section 2
Section 2 describes the config file format, variables naming etc, so
there's no reason why the address format used in this file should be
in a separate section, let's bring it into section 2 as well.
2025-05-14 16:08:02 +02:00
Christopher Faulet
e2ae8a74e8 DEBUG: mux-spop: Review some trace messages to adjust the message or the level
Some trace messages were not really accurrate, reporting a CLOSED connection
while only an error was reported on it. In addition, an TRACE_ERROR() was
used to report a short read on HELLO/DISCONNECT frames header. But it is not
an error. a TRACE_DEVEL() should be used instead.

This patch could be backported to 3.1 to ease future backports.
2025-05-14 11:52:10 +02:00
Christopher Faulet
6e46f0bf93 BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data
When an read error is detected, no error must be reported on the SPOP
connection is there are still some data to parse. It is important to be sure
to process all data before reporting the error and be sure to not truncate
received frames. However, we must also take care to handle short read case
to not wait data that will never be received.

This patch must be backported to 3.1.
2025-05-14 11:51:58 +02:00
Christopher Faulet
16314bb93c BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error
There was no test in the demux part to detect truncated frames and to report
an error at the connection level. The SPOP streams were properly switch to
half-closed state. But waiting the associated SPOE applets were woken up and
released, the SPOP connection could be woken up several times for nothing. I
never triggered the watchdog in that case, but it is not excluded.

Now, at the end of the demux function, if a specific test was added to
detect truncated frames to report an error and close the connection.

This patch must be backported to 3.1.
2025-05-14 11:47:41 +02:00
Christopher Faulet
71feb49a9f BUG/MEDIUM: spop-conn: Report short read for partial frames payload
When a frame was not fully received, a short read must be reported on the
SPOP connection to help the demux to handle truncated frames. This was
performed for frames truncated on the header part but not on the payload
part. It is now properly detected.

This patch must be backported to 3.1.
2025-05-14 09:20:10 +02:00
Christopher Faulet
ddc5f8d92e BUG/MEDIUM: mux-spop: Properly handle CLOSING state
The CLOSING state was not handled at all by the SPOP multiplexer while it is
mandatory when a DISCONNECT frame was sent and the mux should wait for the
DISCONNECT frame in reply from the agent. Thanks to this patch, it should be
fixed.

In addition, if an error occurres during the AGENT HELLO frame parsing, the
SPOP connection is no longer switched to CLOSED state and remains in ERROR
state instead. It is important to be able to send the DISCONNECT frame to
the agent instead of closing the TCP connection immediately.

This patch depends on following commits:

  * BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state
  * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing
  * BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error
  * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer

All the series must be backported to 3.1.
2025-05-14 09:14:12 +02:00
Christopher Faulet
a3940614c2 BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state
SPOP_CS_FRAME_H and SPOP_CS_FRAME_P states, that were used to handle frame
parsing, were removed. The demux process now relies on the demux stream ID
to know if it is waiting for the frame header or the frame
payload. Concretly, when the demux stream ID is not set (dsi == -1), the
demuxer is waiting for the next frame header. Otherwise (dsi >= 0), it is
waiting for the frame payload. It is especially important to be able to
properly handle DISCONNECT frames sent by the agents.

SPOP_CS_RUNNING state is introduced to know the hello handshake was finished
and the SPOP connection is able to open SPOP streams and exchange NOTIFY/ACK
frames with the agents.

It depends on the following fixes:

  * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing
  * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer

This change will be mandatory for the next fix. It must be backported to 3.1
with the commits above.
2025-05-13 19:51:40 +02:00
Christopher Faulet
6b0f7de4e3 MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing
After the ACK frame was parsed, it is useless to set the SPOP connection
state to SPOP_CS_FRAME_H state because this will be automatically handled by
the demux function. If it is not an issue, but this will simplify changes
for the next commit.
2025-05-13 19:51:40 +02:00
Christopher Faulet
197eaaadfd BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error
Till now, only SPOP connections fully closed or those with a TCP connection on
error were concerned. But available streams could be reported for SPOP
connections in error or closing state. But in these states, no NOTIFY frames
will be sent and no ACK frames will be parsed. So, no new SPOP streams should be
opened.

This patch should be backported to 3.1.
2025-05-13 19:51:40 +02:00
Christopher Faulet
cbc10b896e BUG/MINOR: mux-spop: Make the demux stream ID a signed integer
The demux stream ID of a SPOP connection, used when received frames are
parsed, must be a signed integer because it is set to -1 when the SPOP
connection is initialized. It will be important for the next fix.

This patch must be backported to 3.1.
2025-05-13 19:51:40 +02:00
Christopher Faulet
6d68beace5 BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received
When a SPOP connection was closed or was in error, an error was
systematically reported on all its SPOP streams. However, SPOP streams that
already received their ACK frame must be excluded. Otherwise if an agent
sends a ACK and close immediately, the ACK will be ignored because the SPOP
stream will handle the error first.

This patch must be backported to 3.1.
2025-05-13 19:51:40 +02:00
Christopher Faulet
1cd30c998b BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state
When the SPOE applet was released, if a SPOE filter context was still
attached to it, an error was reported to the filter. However, there is no
reason to report an error if the ACK message was already received. Because
of this bug, if the ACK message is received and the SPOE connection is
immediately closed, this prevents the ACK message to be processed.

This patch should be backported to 3.1.
2025-05-13 19:51:40 +02:00
Christopher Faulet
dcce02d6ed DOC: config: Fix a typo in the "term_events" definition
A space was missing before the colon.
2025-05-13 19:51:40 +02:00
Christopher Faulet
a5de0e1595 BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation
When the channel API was revisted, the both functions above was added. An
offset can be passed as argument. However, this parameter could be reported
to be out of range if there was not enough input data was received yet. It
is an issue, especially with a tcp rule, because more data could be
received. If an error is reported too early, this prevent the rule to be
reevaluated later. In fact, an error should only be reported if the offset
is part of the output data.

Another issue is about the conditions to report 'nil' instead of an empty
string. 'nil' was reported when no data was found. But it is not aligned
with the documentation. 'nil' must only be returned if no more data cannot
be received and there is no input data at all.

This patch should fix the issue #2716. It should be backported as far as 2.6.
2025-05-13 19:51:40 +02:00
Willy Tarreau
e049bd00ab MEDIUM: config: change default limits to 1024 threads and 32 groups
A test run on a dual-socket EPYC 9845 (2x160 cores) showed that we'll
be facing new limits during the lifetime of 3.2 with our current 16
groups and 256 threads max:

  $ cat test.cfg
  global
      cpu-policy perforamnce

  $ ./haproxy -dc -c -f test.cfg
  ...
  Thread CPU Bindings:
    Tgrp/Thr  Tid        CPU set
    1/1-32    1-32       32: 0-15,320-335
    2/1-32    33-64      32: 16-31,336-351
    3/1-32    65-96      32: 32-47,352-367
    4/1-32    97-128     32: 48-63,368-383
    5/1-32    129-160    32: 64-79,384-399
    6/1-32    161-192    32: 80-95,400-415
    7/1-32    193-224    32: 96-111,416-431
    8/1-32    225-256    32: 112-127,432-447

Raising the default limit to 1024 threads and 32 groups is sufficient
to buy us enough margin for a long time (hopefully, please don't laugh,
you, reader from the future):

  $ ./haproxy -dc -c -f test.cfg
  ...
  Thread CPU Bindings:
    Tgrp/Thr  Tid        CPU set
    1/1-32    1-32       32: 0-15,320-335
    2/1-32    33-64      32: 16-31,336-351
    3/1-32    65-96      32: 32-47,352-367
    4/1-32    97-128     32: 48-63,368-383
    5/1-32    129-160    32: 64-79,384-399
    6/1-32    161-192    32: 80-95,400-415
    7/1-32    193-224    32: 96-111,416-431
    8/1-32    225-256    32: 112-127,432-447
    9/1-32    257-288    32: 128-143,448-463
    10/1-32   289-320    32: 144-159,464-479
    11/1-32   321-352    32: 160-175,480-495
    12/1-32   353-384    32: 176-191,496-511
    13/1-32   385-416    32: 192-207,512-527
    14/1-32   417-448    32: 208-223,528-543
    15/1-32   449-480    32: 224-239,544-559
    16/1-32   481-512    32: 240-255,560-575
    17/1-32   513-544    32: 256-271,576-591
    18/1-32   545-576    32: 272-287,592-607
    19/1-32   577-608    32: 288-303,608-623
    20/1-32   609-640    32: 304-319,624-639

We can change this default now because it has no functional effect
without any configured cpu-policy, so this will only be an opt-in
and it's better to do it now than to have an effect during the
maintenance phase. A tiny effect is a doubling of the number of
pool buckets and stick-table shards internally, which means that
aside slightly reducing contention in these areas, a dump of tables
can enumerate keys in a different order (hence the adjustment in the
vtc).

The only really visible effect is a slightly higher static memory
consumption (29->35 MB on a small config), but that difference
remains even with 50k servers so that's pretty much acceptable.

Thanks to Erwan Velu for the quick tests and the insights!
2025-05-13 18:15:33 +02:00
Willy Tarreau
158da59c34 MEDIUM: cpu-topo: prefer grouping by CCX for "performance" and "efficiency"
Most of the time, machines made of multiple CPU types use the same L3
for them, and grouping CPUs by frequencies to form groups doesn't bring
any value and on the opposite can impair the incoming connection balancing.
This choice of grouping by cluster was made in order to constitute a good
choice on homogenous machines as well, so better rely on the per-CCX
grouping than the per-cluster one in this case. This will create less
clusters on machines where it counts without affecting other ones.

It doesn't seem necessary to change anything for the "resource" policy
since it selects a single cluster.
2025-05-13 16:48:30 +02:00
Willy Tarreau
70b0dd6b0f MEDIUM: cpu-topo: change "efficiency" to consider per-core capacity
This is similar to the previous change to the "performance" policy but
it applies to the "efficiency" one. Here we're changing the sorting
method to sort CPU clusters by average per-CPU capacity, and we evict
clusters whose per-CPU capacity is above 125% of the previous one.
Per-core capacity allows to detect discrepancies between CPU cores,
and to continue to focus on efficient ones as a priority.
2025-05-13 16:48:30 +02:00
Willy Tarreau
6c88e27cf4 MEDIUM: cpu-topo: change "performance" to consider per-core capacity
Running the "performance" policy on highly heterogenous systems yields
bad choices when there are sufficiently more small than big cores,
and/or when there are multiple cluster types, because on such setups,
the higher the frequency, the lower the number of cores, despite small
differences in frequencies. In such cases, we quickly end up with
"performance" only choosing the small or the medium cores, which is
contrary to the original intent, which was to select performance cores.
This is what happens on boards like the Orion O6 for example where only
the 4 medium cores and 2 big cores are choosen, evicting the 2 biggest
cores and the 4 smallest ones.

Here we're changing the sorting method to sort CPU clusters by average
per-CPU capacity, and we evict clusters whose per-CPU capacity falls
below 80% of the previous one. Per-core capacity allows to detect
discrepancies between CPU cores, and to continue to focus on high
performance ones as a priority.
2025-05-13 16:48:30 +02:00
Willy Tarreau
5ab2c815f1 MINOR: cpu-topo: provide a function to sort clusters by average capacity
The current per-capacity sorting function acts on a whole cluster, but
in some setups having many small cores and few big ones, it becomes
easy to observe an inversion of metrics where the many small cores show
a globally higher total capacity than the few big ones. This does not
necessarily fit all use cases. Let's add new a function to sort clusters
by their per-cpu average capacity to cover more use cases.
2025-05-13 16:48:30 +02:00
Willy Tarreau
01df98adad MINOR: cpu-topo: add a new "group-by-ccx" CPU policy
This cpu-policy will only consider CCX and not clusters. This makes
a difference on machines with heterogenous CPUs that generally share
the same L3 cache, where it's not desirable to create multiple groups
based on the CPU types, but instead create one with the different CPU
types. The variants "group-by-2/3/4-ccx" have also been added.

Let's also add some text explaining the difference between cluster
and CCX.
2025-05-13 16:48:30 +02:00
Willy Tarreau
33d8b006d4 BUG/MINOR: cpu-topo: fix group-by-cluster policy for disordered clusters
Some (rare) boards have their clusters in an erratic order. This is
the case for the Radxa Orion O6 where one of the big cores appears as
CPU0 due to booting from it, then followed by the small cores, then the
medium cores, then the remaining big cores. This results in clusters
appearing this order: 0,2,1,0.

The core in cpu_policy_group_by_cluster() expected ordered clusters,
and performs ordered comparisons to decide whether a CPU's cluster has
already been taken care of. On the board above this doesn't work, only
clusters 0 and 2 appear and 1 is skipped.

Let's replace the cluster number comparison with a cpuset to record
which clusters have been taken care of. Now the groups properly appear
like this:

  Tgrp/Thr  Tid        CPU set
  1/1-2     1-2        2: 0,11
  2/1-4     3-6        4: 1-4
  3/1-6     7-12       6: 5-10

No backport is needed, this is purely 3.2.
2025-05-13 16:48:30 +02:00
Amaury Denoyelle
f3b9676416 MINOR: quic: display stream age
Add a field to save the creation date of qc_stream_desc instance. This
is useful to display QUIC stream age in "show quic stream" output.
2025-05-13 15:44:22 +02:00
Amaury Denoyelle
dbf07c754e MINOR: quic: display QCS info on "show quic stream"
Complete stream output for "show quic" by displaying information from
its upper QCS. Note that QCS may be NULL if already released, so a
default output is also provided.
2025-05-13 15:43:28 +02:00
Amaury Denoyelle
cbadfa0163 MINOR: quic: add stream format for "show quic"
Add a new format for "show quic" command labelled as "stream". This is
an equivalent of "show sess", dedicated to the QUIC stack. Each active
QUIC streams are listed on a line with their related infos.

The main objective of this command is to ensure there is no freeze
streams remaining after a transfer.
2025-05-13 15:41:51 +02:00
Amaury Denoyelle
1ccede211c MINOR: mux-quic: account Rx data per stream
Add counters to measure Rx buffers usage per QCS. This reused the newly
defined bdata_ctr type already used for Tx accounting.

Note that for now, <tot> value of bdata_ctr is not used. This is because
it is not easy to account for data accross contiguous buffers.

These values are displayed both on log/traces and "show quic" output.
2025-05-13 15:41:51 +02:00
Amaury Denoyelle
a1dc9070e7 MINOR: quic: account Tx data per stream
Add accounting at qc_stream_desc level to be able to report the number
of allocated Tx buffers and the sum of their data. This represents data
ready for emission or already emitted and waiting on ACK.

To simplify this accounting, a new counter type bdata_ctr is defined in
quic_utils.h. This regroups both buffers and data counter, plus a
maximum on the buffer value.

These values are now displayed on QCS info used both on logline and
traces, and also on "show quic" output.
2025-05-13 15:41:41 +02:00
Willy Tarreau
9a05c1f574 BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly
As discussed here:
   https://github.com/httpwg/http2-spec/pull/936
   https://github.com/haproxy/haproxy/issues/2941

It's important to take care of some special characters in the :authority
pseudo header before reassembling a complete URI, because after assembly
it's too late (e.g. the '/'). This patch does this, both for h2 and h3.

The impact on H2 was measured in the worst case at 0.3% of the request
rate, while the impact on H3 is around 1%, but H3 was about 1% faster
than H2 before and is now on par.

It may be backported after a period of observation, and in this case it
relies on this previous commit:

   MINOR: http: add a function to validate characters of :authority

Thanks to @DemiMarie for reviving this topic in issue #2941 and bringing
new potential interesting cases.
2025-05-12 18:02:47 +02:00
Willy Tarreau
ebab479cdf MINOR: http: add a function to validate characters of :authority
As discussed here:
  https://github.com/httpwg/http2-spec/pull/936
  https://github.com/haproxy/haproxy/issues/2941

It's important to take care of some special characters in the :authority
pseudo header before reassembling a complete URI, because after assembly
it's too late (e.g. the '/').

This patch adds a specific function which was checks all such characters
and their ranges on an ist, and benefits from modern compilers
optimizations that arrange the comparisons into an evaluation tree for
faster match. That's the version that gave the most consistent performance
across various compilers, though some hand-crafted versions using bitmaps
stored in register could be slightly faster but super sensitive to code
ordering, suggesting that the results might vary with future compilers.
This one takes on average 1.2ns per character at 3 GHz (3.6 cycles per
char on avg). The resulting impact on H2 request processing time (small
requests) was measured around 0.3%, from 6.60 to 6.618us per request,
which is a bit high but remains acceptable given that the test only
focused on req rate.

The code was made usable both for H2 and H3.
2025-05-12 18:02:47 +02:00
Aurelien DARRAGON
c40d6ac840 BUG/MINOR: server: perform lbprm deinit for dynamic servers
Last commit 7361515 ("BUG/MINOR: server: dont depend on proxy for server
cleanup in srv_drop()") introduced a regression because the lbprm
server_deinit is not evaluated anymore with dynamic servers, possibly
resulting in a memory leak.

To fix the issue, in addition to free_proxy(), the server deinit check
should be manually performed in cli_parse_delete_server() as well.

No backport needed.
2025-05-12 16:29:36 +02:00
Aurelien DARRAGON
736151556c BUG/MINOR: server: dont depend on proxy for server cleanup in srv_drop()
In commit b5ee8bebfc ("MINOR: server: always call ssl->destroy_srv when
available"), we made it so srv_drop() doesn't depend on proxy to perform
server cleanup.

It turns out this is now mandatory, because during deinit, free_proxy()
can occur before the final srv_drop(). This is the case when using Lua
scripts for instance.

In 2a9436f96 ("MINOR: lbprm: Add method to deinit server and proxy") we
added a freeing check under srv_drop() that depends on the proxy.
Because of that UAF may occur during deinit when using a Lua script that
manipulate server objects.

To fix the issue, let's perform the lbprm server deinit logic under
free_proxy() directly, where the DEINIT server hooks are evaluated.

Also, to prevent similar bugs in the future, let's explicitly document
in srv_drop() that server cleanups should assume that the proxy may
already be freed.

No backport needed unless 2a9436f96 is.
2025-05-12 16:17:26 +02:00
Willy Tarreau
be4d816be2 BUG/MINOR: cfgparse: improve the empty arg position report's robustness
OSS Fuzz found that the previous fix ebb19fb367 ("BUG/MINOR: cfgparse:
consider the special case of empty arg caused by \x00") was incomplete,
as the output can sometimes be larger than the input (due to variables
expansion) in which case the work around to try to report a bad arg will
fail. While the parse_line() function has been made more robust now in
order to avoid this condition, let's fix the handling of this special
case anyway by just pointing to the beginning of the line if the supposed
error location is out of the line's buffer.

All details here:
   https://oss-fuzz.com/testcase-detail/5202563081502720

No backport is needed unless the fix above is backported.
2025-05-12 16:11:15 +02:00
Willy Tarreau
2b60e54fb1 BUG/MINOR: tools: improve parse_line()'s robustness against empty args
The fix in 10e6d0bd57 ("BUG/MINOR: tools: only fill first empty arg when
not out of range") was not that good. It focused on protecting against
<arg> becoming out of range to detect we haven't emitted anything, but
it's not the right way to detect this. We're always maintaining arg_start
as a copy of outpos, and that later one is incremented when emitting a
char, so instead of testing args[arg] against out+arg_start, we should
instead check outpos against arg_start, thereby eliminating the <out>
offset and the need to access args[]. This way we now always know if
we've emitted an empty arg without dereferencing args[].

There's no need to backport this unless the fix above is also backported.
2025-05-12 16:11:15 +02:00
Aurelien DARRAGON
7d057e56af BUG/MINOR: threads: fix soft-stop without multithreading support
When thread support is disabled ("USE_THREAD=" or "USE_THREAD=0" when
building), soft-stop doesn't work as haproxy never ends after stopping
the proxies.

This used to work fine in the past but suddenly stopped working with
ef422ced91 ("MEDIUM: thread: make stopping_threads per-group and add
stopping_tgroups") because the "break;" instruction under the stopping
condition is never executed when support for multithreading is disabled.

To fix the issue, let's add an "else" block to run the "break;"
instruction when USE_THREAD is not defined.

It should be backported up to 2.8
2025-05-12 14:18:39 +02:00
William Lallemand
8b0d1a4113 MINOR: ssl/ckch: warn when the same keyword was used twice
When using a crt-list or a crt-store, keywords mentionned twice on the
same line overwritte the previous value.

This patch emits a warning when the same keyword is found another time
on the same line.
2025-05-09 19:18:38 +02:00
William Lallemand
9c0c05b7ba BUG/MINOR: ssl/ckch: always ha_freearray() the previous entry during parsing
The ckch_conf_parse() function is the generic function which parses
crt-store keywords from the crt-store section, and also from a
crt-list.

When having multiple time the same keyword, a leak of the previous
value happens. This patch ensure that the previous value is always
freed before overwriting it.

This is the same problem as the previous "BUG/MINOR: ssl/ckch: always
free() the previous entry during parsing" patch, however this one
applies on PARSE_TYPE_ARRAY_SUBSTR.

No backport needed.
2025-05-09 19:16:02 +02:00
William Lallemand
96b1f1fd26 MINOR: tools: ha_freearray() frees an array of string
ha_freearray() is a new function which free() an array of strings
terminated by a NULL entry.

The pointer to the array will be free and set to NULL.
2025-05-09 19:12:05 +02:00
William Lallemand
311e0aa5c7 BUG/MINOR: ssl/ckch: always free() the previous entry during parsing
The ckch_conf_parse() function is the generic function which parses
crt-store keywords from the crt-store section, and also from a crt-list.

When having multiple time the same keyword, a leak of the previous value
happens. This patch ensure that the previous value is always freed
before overwriting it.

This patch should be backported as far as 3.0.
2025-05-09 19:01:28 +02:00
William Lallemand
9ce3fb35a2 BUG/MINOR: ssl: prevent multiple 'crt' on the same ssl-f-use line
The 'ssl-f-use' implementation doesn't prevent to have multiple time the
'crt' keyword, which overwrite the previous value. Letting users think
that is it possible to use multiple certificates on the same line, which
is not the case.

This patch emits an alert when setting the 'crt' keyword multiple times
on the same ssl-f-use line.

Should fix issue #2966.

No backport needed.
2025-05-09 18:52:09 +02:00
William Lallemand
0c4abf5a22 BUG/MINOR: ssl: doesn't fill conf->crt with first arg
Commit c7f29afc ("MEDIUM: ssl: replace "crt" lines by "ssl-f-use"
lines") forgot to remove an the allocation of the crt field which was
done with the first argument.

Since ssl-f-use takes keywords, this would put the first keyword in
"crt" instead of the certificate name.
2025-05-09 18:23:06 +02:00
Willy Tarreau
8a96216847 MEDIUM: sock-inet: re-check IPv6 connectivity every 30s
IPv6 connectivity might start off (e.g. network not fully up when
haproxy starts), so for features like resolvers, it would be nice to
periodically recheck.

With this change, instead of having the resolvers code rely on a variable
indicating connectivity, it will now call a function that will check for
how long a connectivity check hasn't been run, and will perform a new one
if needed. The age was set to 30s which seems reasonable considering that
the DNS will cache results anyway. There's no saving in spacing it more
since the syscall is very check (just a connect() without any packet being
emitted).

The variables remain exported so that we could present them in show info
or anywhere else.

This way, "dns-accept-family auto" will now stay up to date. Warning
though, it does perform some caching so even with a refreshed IPv6
connectivity, an older record may be returned anyway.
2025-05-09 15:45:44 +02:00
Willy Tarreau
1404f6fb7b DEBUG: pools: add a new integrity mode "backup" to copy the released area
This way we can preserve the entire contents of the released area for
later inspection. This automatically enables comparison at reallocation
time as well (like "integrity" does). If used in combination with
integrity, the comparison is disabled but the check of non-corruption
of the area mangled by integrity is still operated.
2025-05-09 14:57:00 +02:00
William Lallemand
e7574cd5f0 MINOR: acme: add the global option 'acme.scheduler'
The automatic scheduler is useful but sometimes you don't want to use,
or schedule manually.

This patch adds an 'acme.scheduler' option in the global section, which
can be set to either 'auto' or 'off'. (auto is the default value)

This also change the ouput of the 'acme status' command so it does not
shows scheduled values. The state will be 'Stopped' instead of
'Scheduled'.
2025-05-09 14:00:39 +02:00
Willy Tarreau
0ae14beb2a DEBUG: pool: permit per-pool UAF configuration
The new MEM_F_UAF flag can be set just after a pool's creation to make
this pool UAF for debugging purposes. This allows to maintain a better
overall performance required to reproduce issues while still having a
chance to catch UAF. It will only be used by developers who will manually
add it to areas worth being inspected, though.
2025-05-09 13:59:02 +02:00
Amaury Denoyelle
14e4f2b811 BUG/MEDIUM: mux-quic: fix crash on invalid fctl frame dereference
Emission of flow-control frames have been recently modified. Now, each
frame is sent one by one, via a single entry list. If a failure occurs,
emission is interrupted and frame is reinserted into the original
<qcc.lfctl.frms> list.

This code is incorrect as it only checks if qcc_send_frames() returns an
error code to perform the reinsert operation. However, an error here
does not always mean that the frame was not properly emitted by lower
quic-conn layer. As such, an extra test LIST_ISEMPTY() must be performed
prior to reinsert the frame.

This bug would cause a heap overflow. Indeed, the reinsert frame would
be a random value. A crash would occur as soon as it would be
dereferenced via <qcc.lfctl.frms> list.

This was reproduced by issuing a POST with a big file and interrupt it
after just a few seconds. This results in a crash in about a third of
the tests. Here is an example command using ngtcp2 :

 $ ngtcp2-client -q --no-quic-dump --no-http-dump \
   -m POST -d ~/infra/html/1g 127.0.0.1 20443 "http://127.0.0.1:20443/post"

Heap overflow was detected via a BUG_ON() statement from qc_frm_free()
via qcc_release() caller :

  FATAL: bug condition "!((&((*frm)->reflist))->n == (&((*frm)->reflist)))" matched at src/quic_frame.c:1270

This does not need to be backported.
2025-05-09 11:07:11 +02:00
Willy Tarreau
3f9194bfc9 [RELEASE] Released version 3.2-dev15
Released version 3.2-dev15 with the following main changes :
    - BUG/MEDIUM: stktable: fix sc_*(<ctr>) BUG_ON() regression with ctx > 9
    - BUG/MINOR: acme/cli: don't output error on success
    - BUG/MINOR: tools: do not create an empty arg from trailing spaces
    - MEDIUM: config: warn about the consequences of empty arguments on a config line
    - MINOR: tools: make parse_line() provide hints about empty args
    - MINOR: cfgparse: visually show the input line on empty args
    - BUG/MINOR: tools: always terminate empty lines
    - BUG/MINOR: tools: make parseline report the required space for the trailing 0
    - DEBUG: threads: don't keep lock label "OTHER" in the per-thread history
    - DEBUG: threads: merge successive idempotent lock operations in history
    - DEBUG: threads: display held locks in threads dumps
    - BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends
    - Revert "BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame"
    - MINOR: acme/cli: 'acme status' show the status acme-configured certificates
    - MEDIUM: acme/ssl: remove 'acme ps' in favor of 'acme status'
    - DOC: configuration: add "acme" section to the keywords list
    - DOC: configuration: add the "crt-store" keyword
    - BUG/MAJOR: queue: lock around the call to pendconn_process_next_strm()
    - MINOR: ssl: add filename and linenum for ssl-f-use errors
    - BUG/MINOR: ssl: can't use crt-store some certificates in ssl-f-use
    - BUG/MINOR: tools: only fill first empty arg when not out of range
    - MINOR: debug: bump the dump buffer to 8kB
    - MINOR: stick-tables: add "ipv4" as an alias for the "ip" type
    - MINOR: quic: extend return value during TP parsing
    - BUG/MINOR: quic: use proper error code on missing CID in TPs
    - BUG/MINOR: quic: use proper error code on invalid server TP
    - BUG/MINOR: quic: reject retry_source_cid TP on server side
    - BUG/MINOR: quic: use proper error code on invalid received TP value
    - BUG/MINOR: quic: fix TP reject on invalid max-ack-delay
    - BUG/MINOR: quic: reject invalid max_udp_payload size
    - BUG/MEDIUM: peers: hold the refcnt until updating ts->seen
    - BUG/MEDIUM: stick-tables: close a tiny race in __stksess_kill()
    - BUG/MINOR: cli: fix too many args detection for commands
    - MINOR: server: ensure server postparse tasks are run for dynamic servers
    - BUG/MEDIUM: stick-table: always remove update before adding a new one
    - BUG/MEDIUM: quic: free stream_desc on all data acked
    - BUG/MINOR: cfgparse: consider the special case of empty arg caused by \x00
    - DOC: config: recommend disabling libc-based resolution with resolvers
2025-05-09 10:51:30 +02:00
Willy Tarreau
4e20fab7ac DOC: config: recommend disabling libc-based resolution with resolvers
Using both libc and haproxy resolvers can lead to hard to diagnose issues
when their bevahiour diverges; recommend using only one type of resolver.

Should be backported to stable versions.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg45663.html
Co-authored-by: Lukas Tribus <lukas@ltri.eu>
2025-05-09 10:31:39 +02:00