25636 Commits

Author SHA1 Message Date
Remi Tricot-Le Breton
c606ff45a0 BUG/MINOR: init: Do not close previously created fd in stdio_quiet
During init we were calling 'stdio_quiet' and passing the previously
created 'devnullfd' file descriptor. But this FD was closed inside
'stdio_quiet' and closed again afterwards, which raised an error (EBADF).
If we keep from closing FDs that were opened outside of the
'stdio_quiet' function we will let the caller manage its FD and avoid
double close calls.

This patch can be backported to all stable branches.
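
The ownership rule described above can be sketched as follows. This is a
minimal illustration, not the actual haproxy code: quiet_stdio() is a
hypothetical stand-in for 'stdio_quiet', and the convention that a
negative argument means "open /dev/null yourself" is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for stdio_quiet(): redirect stdin, stdout and
 * stderr to <fd>. If <fd> is negative, /dev/null is opened locally and
 * that local FD is closed before returning. An FD supplied by the caller
 * is never closed here: the caller remains its only owner.
 * Returns 0 on success, -1 on failure.
 */
int quiet_stdio(int fd)
{
	int own = -1;
	int ret = 0;

	if (fd < 0) {
		own = fd = open("/dev/null", O_RDWR);
		if (fd < 0)
			return -1;
	}

	if (dup2(fd, 0) < 0 || dup2(fd, 1) < 0 || dup2(fd, 2) < 0)
		ret = -1;

	/* only close what this function opened itself, and never one of
	 * the standard descriptors it has just set up.
	 */
	if (own > 2)
		close(own);
	return ret;
}
```

With this split, a caller that opened the FD itself calls close() on it
exactly once after the function returns, so no EBADF can result from a
second close.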
2025-10-29 10:54:17 +01:00
Huangbin Zhan
ad9a24ee55 MINOR: http: fix 405,431,501 default errorfile
A few typos were present in the default errorfiles for the status codes
above (missing dot at the end of the sentence, extra closing bracket).
This fixes them. This can be backported.
2025-10-29 08:47:19 +01:00
Ilia Shipitsin
9781d91e4d CI: disable fail-fast on fedora rawhide builds
Previously the builds were dependent: if one failed, the others were
stopped. By their nature those builds are independent, so let's not fail
them all together.
2025-10-29 08:15:01 +01:00
Willy Tarreau
18b27bfec9 MINOR: ssl-sample: add ssl_fc_early_rcvd() to detect use of early data
We currently have ssl_fc_has_early() which says that early data are still
unconfirmed by a final handshake, but nothing to see if a client has been
able to use early data at all, which is a problem because such mechanisms
generally depend on multiple factors and it's hard to know when they start
to work. This new sample fetch function will indicate that some early data
were seen over that front connection, i.e. this can be used to confirm
that at some point the client was able to push some. This is essentially
a debugging tool that has no practical use case other than debugging.
2025-10-29 08:13:29 +01:00
Willy Tarreau
765d49b680 DOC: config: slightly clarify the ssl_fc_has_early() behavior
Clarify that it's about handshake *completion*, and also mention that
the action to be used to wait for the handshake is "wait-for-handshake",
which was not mentioned.

This can be backported though it's very minor.
2025-10-29 08:13:29 +01:00
Willy Tarreau
20174ca143 DOC: config: fix confusing typo about ACL -m ("now" vs "not")
A one-letter typo in the doc update coming with commit 6ea50ba462 ("MINOR:
acl; Warn when matching method based on a suffix is overwritten") inverts
the meaning of the sentence. It was "is not allowed" and not
"is now allowed". Needs to be backported only if the commit above ever is
(unlikely).
2025-10-29 08:13:29 +01:00
Amaury Denoyelle
7f2ae10920 BUG/MINOR: acl: warn if "_sub" derivative used with an explicit match
Recently, a new warning is displayed when an ACL derivative match method
is overridden with another '-m' method. This is implemented via the
following patch :

  6ea50ba462692d6dcf301081f23cab3e0f6086e4
  MINOR: acl; Warn when matching method based on a suffix is overwritten

However, this warning was not reported when "_sub" suffix was specified.
Fix this by adding PAT_MATCH_SUB in the warning comparison.

No backport needed except if above commit is.
2025-10-28 11:59:32 +01:00
Remi Tricot-Le Breton
89b43740e3 BUG/MINOR: ssl: Remove unreachable code in CLI function
Remove unreachable code in 'cli_parse_show_jwt' function.

This bug was raised in GitHub #3159.
This patch does not need to be backported.
2025-10-28 10:44:51 +01:00
Remi Tricot-Le Breton
7482b6ebf0 BUG/MEDIUM: ssl: Crash because of dangling ckch_store reference in a ckch instance
When updating CAs via the CLI, we need to create new copies of all the
impacted ckch instances (i.e. those referenced in the ckch_inst_link list
of the updated CA) in order to use them instead of the old ones once the
update is completed. This relies on the ckch_inst_rebuild function that
would set the ckch_store field of the ckch_inst. But we forgot to also
add the newly created instances in the ckch_inst list of the
corresponding ckch_store.

When updating a certificate afterwards, we iterate over all the
instances linked in the ckch_inst list of the ckch_store (which is
missing some instances because of the previous command) and rebuild the
instances before replacing the ckch_store. The previous ckch_store,
still referenced by the dangling ckch instance then gets deleted which
means that the instance keeps a reference to a free'd object.

Then if we were to once again update the CA file, we would iterate over
the ckch instances referenced in the cafile_entry's ckch_inst_link list,
which includes the first mentioned ckch instance with the dead
ckch_store reference. This ends up crashing during the ckch_inst_rebuild
operation.

This bug was raised in GitHub #3165.
This patch should be backported to all stable branches.
2025-10-28 10:43:45 +01:00
Willy Tarreau
2d7e3ddd4a BUG/MEDIUM: cli: do not return ACKs one char at a time
Since 3.0 where the CLI started to use rcv_buf, it appears that some
external tools sending chained commands are randomly experiencing
failures. Each time, this happens when the whole command is sent as a
single packet, immediately followed by a close. This is not a correct
way to use the CLI but this has been working for ages for simple
netcat-based scripts, so we should at least try to preserve this.

The cause of the failure is that the first LF that acks a command is
immediately sent back to the client and rejected due to the closed
connection. This in turn forwards the error back to the applet which
aborts its processing.

Before 3.0 the responses would be queued into the buffer, then sent
back to the channel, and would all fail at once. This changed when
snd_buf/rcv_buf were implemented because the applets are much more
responsive and since they yield between each command, they can
deliver one ACK at a time that is immediately forwarded down the
chain.

An easy way to observe the problem is to send 5 map updates, a shutdown,
and immediately close via tcploop, and in parallel run a periodic
"show map" to count the number of elements:

  $ tcploop -U /tmp/sock1 C S:"add map #0 1 1; add map #0 2 2; add map #0 3 3; add map #0 4 4; add map #0 5 5\n" F K

Before 3.0, there would always be 5 elements. Since 3.0 and before
20ec1de214 ("MAJOR: cli: Refacor parsing and execution of pipelined
commands"), almost always 2. And since that commit above in 3.2, almost
always one. Doing the same using socat or netcat shows almost always 5...
It's entirely timing-dependent, and might even vary based on the RTT
between the client and haproxy!

The approach taken here consists in applying the same principle as MSG_MORE
or Nagle but on the response buffer: the applet doesn't need to send a
single ACK for each command when it has already been woken up and is
scheduled to come back to work. It's fine (and even desirable) that
ACKs are grouped in a single packet as much as possible.

For this reason, this patch implements APPCTX_CLI_ST1_YIELD, a new CLI
flag which indicates that the applet left in a yielding condition, i.e.
it has not finished its work. This flag is used by .rcv_buf to hold
pending data. This way we won't return partial responses for no reason,
and we can continue to emulate the previous behavior.
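
The hold-while-yielding idea can be sketched as below. This is a
simplified model under stated assumptions: ST_YIELD, struct applet_ctx
and bytes_to_forward() are hypothetical names, and the real .rcv_buf
callback also has to deal with buffer room and error conditions that are
not modeled here.

```c
#include <assert.h>
#include <stddef.h>

#define ST_YIELD 0x1   /* hypothetical: handler yielded with work left */

struct applet_ctx {
	unsigned int flags;
	size_t pending;    /* response bytes waiting in the output buffer */
};

/* Return how many response bytes may be forwarded right now. While the
 * handler is merely yielding between two commands, nothing is released
 * so that the ACKs accumulate. Once it has finished its work, everything
 * pending is released in a single batch.
 */
size_t bytes_to_forward(struct applet_ctx *ctx)
{
	size_t ret;

	assert(ctx != NULL);
	if (ctx->flags & ST_YIELD)
		return 0;

	ret = ctx->pending;
	ctx->pending = 0;
	return ret;
}
```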

One very nice benefit to this is that it saves huge amounts of CPU on
the client. In the test below that tries to update 1M map entries, the
CPU used by socat went from 100% to 0% and the total transfer time
dropped by 28%:

  before:
    $ time awk 'BEGIN{ printf "prompt i\n"; for (i=0;i<1000000;i++) { \
         printf "add map #0 %d %d\n",i,i,i }}' | socat /tmp/sock1 - >/dev/null

    real    0m2.407s
    user    0m1.485s
    sys     0m1.682s

  after:
    $ time awk 'BEGIN{ printf "prompt i\n"; for (i=0;i<1000000;i++) { \
         printf "add map #0 %d %d\n",i,i,i }}' | socat /tmp/sock1 - >/dev/null

    real    0m1.721s
    user    0m0.952s
    sys     0m0.057s

The difference is also quite visible on the number of syscalls during
the test (for 1k updates):

  before:
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.071691           0    100001           sendmsg

  after:
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.000011           1         9           sendmsg

This patch will need to be backported to 3.0, and depends on these two
patches to be backported as well:

    MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty
    MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf()
2025-10-27 16:57:07 +01:00
Willy Tarreau
f38ea2731b MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf()
This is in preparation for a future fix. For now it's simply a pure
copy of the original function, but dedicated to the CLI. It will
have to be backported to 3.0.
2025-10-27 16:57:07 +01:00
Willy Tarreau
35106d65fb MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty
appctx_rcv_buf() prepares all the work to schedule the transfers between
the applet and the channel, and it takes care of setting the various flags
that indicate what condition is blocking the transfer from progressing.

There is one limitation though. In case an applet refrains from sending
data (e.g. rate-limited, prefers to aggregate blocks etc), it will leave
a possibly empty channel buffer, and keep some data in its outbuf. The
data in its outbuf will be seen by the function above as an indication
of a channel full condition, so it will place SE_FL_WANT_ROOM. But later,
sc_applet_recv() will see this flag with a possibly empty channel, and
will rightfully trigger a BUG_ON().

appctx_rcv_buf() should be more accurate in fact. It should only set
SE_FL_RCV_MORE when more data are present in the applet, then it should
either set or clear SE_FL_WANT_ROOM depending on whether the channel is
empty or not.
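
The intended flag logic can be sketched as a pure function. The flag
names and values below (FL_RCV_MORE, FL_WANT_ROOM) are placeholders
chosen for the example, not the real SE_FL_* definitions, and the two
byte counts stand for "data left in the applet's outbuf" and "data
present in the channel".

```c
#include <assert.h>
#include <stddef.h>

#define FL_RCV_MORE  0x1   /* placeholder for SE_FL_RCV_MORE */
#define FL_WANT_ROOM 0x2   /* placeholder for SE_FL_WANT_ROOM */

/* Recompute both flags from the current state:
 *  - RCV_MORE only when the applet still holds data;
 *  - WANT_ROOM only when, in addition, the channel is not empty, since
 *    an empty channel cannot be what blocks the transfer.
 * Other bits of <flags> are preserved.
 */
unsigned int update_rcv_flags(unsigned int flags, size_t applet_pending,
                              size_t channel_data)
{
	flags &= ~(unsigned int)(FL_RCV_MORE | FL_WANT_ROOM);
	if (applet_pending) {
		flags |= FL_RCV_MORE;
		if (channel_data)
			flags |= FL_WANT_ROOM;
	}
	return flags;
}
```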

Right now it doesn't seem possible to trigger this condition in the
current state of applets, but this will become possible with a future
bugfix that will have to be backported, so this patch will need to be
backported to 3.0.
2025-10-27 16:57:07 +01:00
Olivier Houchard
259b1e1c18 MEDIUM: quic: Fix build with openssl-compat
As the QUIC options have been split into backend and frontend, there is
no more GTUNE_QUIC_LISTEN_OFF to be found in global.tune.options, look
for QUIC_TUNE_FE_LISTEN_OFF in quic_tune.fe instead.
This should fix the build with USE_QUIC and USE_QUIC_OPENSSL_COMPAT.
2025-10-24 13:51:15 +02:00
Olivier Houchard
837351245a BUG/MEDIUM: mt_list: Use atomic operations to prevent compiler optims
As a follow-up to f40f5401b9f24becc6fdd2e77d4f4578bbecae7f, explicitly
use atomic operations to set the prev and next fields, to make sure the
compiler can't assume anything about them and just performs the stores.

This should be backported after f40f5401b9 up to 2.8.
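
A minimal sketch of the idea using C11 atomics is shown below. The
struct and function names are made up for the example, and haproxy uses
its own atomic macros rather than <stdatomic.h>, so this only
illustrates the principle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	_Atomic(struct node *) next;
	_Atomic(struct node *) prev;
};

/* Turn <el> into a self-linked (detached) element. Each field receives
 * <el> through its own atomic store: neither value is obtained by
 * reading back the other field, which a concurrent thread may already
 * have changed.
 */
static inline void node_self_link(struct node *el)
{
	assert(el != NULL);
	atomic_store_explicit(&el->next, el, memory_order_relaxed);
	atomic_store_explicit(&el->prev, el, memory_order_relaxed);
}
```

Relaxed ordering is used here only to keep the example short; which
ordering the real code needs depends on how the list is locked.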
2025-10-24 13:34:41 +02:00
Willy Tarreau
2ec6df59bf BUILD: openssl-compat: fix build failure with OPENSSL=0 and KTLS=1
The USE_KTLS test is currently being done outside of the USE_OPENSSL
guard so disabling USE_OPENSSL still results in build failures on
libcs built with support for kernels before 4.17, because we enable
KTLS by default on linux. Let's move the KTLS block inside the
USE_OPENSSL guard instead.

No backport is needed since KTLS is only in 3.3.
2025-10-24 10:45:02 +02:00
Willy Tarreau
1824079fca BUG/MINOR: stick-tables: properly index string-type keys
This is one of the rare pleasant surprises of fixing an almost 16-year-old
bug that remained unnoticed since the feature was implemented. In
1.4-dev7, commit 3bd697e071 ("[MEDIUM] Add stick table (persistence)
management functions and types") introduced stick-tables with multiple
key types, including strings, IP addresses and integers. Entries are
coded in binary and their binary representation is indexed. A special
case was made for strings in order to index them as zero-terminated
strings. However, there's one subtlety. While strings indeed have a
zero appended, they're still indexed using ebmb_insert(), which means
that all the bytes till the configured size are indexed as well. And
while these bytes generally come from a temporary storage that often
contains zeroes, or that is longer than the configured string length
and will result in truncation, it's not always the case and certain
traffic patterns with certain configurations manage to occasionally
present unpadded strings resulting in apparent duplicate keys appearing
in the dump, as shown in GH issue #3161. It seems to be essentially
reproducible at boot, and not to be particularly affected by mixed
patterns. These keys are in fact not exact duplicates in memory, but
everywhere they're used (including during synchronization), they are
equal.

What's interesting is that when this happens, one key can be presented
to a peer with its own data and will be indexed as the only one, possibly
replacing contents from the previous key, which might replace them again
later once updated in turn. This is visible in the dump of the issue
above, where key "localhost:8001" was split into two entries, one with a
request count of one and the other with a request count of 499999, and
indeed, all peers see only that last value, which overwrote the first
one.

This fix must be backported to all stable branches. Special kudos to
Mark Wort for underlining that one.
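
The effect can be illustrated with a fixed-size key buffer. The helper
below is hypothetical and is not the fix applied by this patch: it only
shows that when every byte after the string is zeroed, a comparison over
the full configured size depends on the string alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy string <src> into the fixed-size key buffer <dst> of <size> bytes
 * (size > 0), truncating if needed, and zero-fill every byte following
 * the string. Returns the number of string bytes kept.
 */
size_t key_store_padded(char *dst, size_t size, const char *src)
{
	size_t len = strlen(src);

	assert(size > 0);
	if (len > size - 1)
		len = size - 1;
	memcpy(dst, src, len);
	memset(dst + len, 0, size - len);
	return len;
}
```

Without the memset(), two buffers holding the same string but different
leftover bytes would compare as different over <size> bytes, which is
the "apparent duplicate" situation described above.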
2025-10-24 10:15:11 +02:00
Aurelien DARRAGON
d655ed5f14 BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency (2nd attempt)
This is a second attempt at fixing issues on 32bits systems which would
trigger the following BUG_ON() statement:

 FATAL: bug condition "sizeof(struct shm_stats_file_object) != 544" matched at src/stats-file.c:825 shm_stats_file_object struct size changed, is is part of the exported API: ensure all precautions were taken (ie: shm_stats_file version change) before adjusting this

This is a drop-in replacement for d30b88a6c + 4693ee0ff, as suggested by
Willy.

Indeed, on supported platforms unsigned int can be assumed to be 4 bytes
long, and long can be assumed to be 8 bytes long. As such, the previous
attempt was overkill and added unnecessary maintenance complexity which
could result in bugs if not used properly. Moreover, it would only
partially solve the issue, since on little endian vs big endian
architectures, the provisioned memory areas (originating from the same
shm stats file) could be read differently by the host.

Instead we fix the alignment issues, and this alone helps to ensure
struct memory consistency on 64 vs 32bits platforms. It was tested
on both i386 and i586.

last_change and last_sess counters are now stored as unsigned int, as
it helped to fix the alignment issues and they were found to be used
as 32bits integers anyway.

Thanks to Willy for problem analysis and the patch proposal.

No backport needed.
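
As an illustration of the approach, the hypothetical struct below is
laid out so that no implicit padding is needed, and compile-time checks
pin the layout. The member names other than last_change and last_sess
are invented for the example; this is not the real
shm_stats_file_object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit members come first at offsets that are multiples of 8, and the
 * 32-bit members come as a pair, so the compiler has no padding to
 * insert whether 64-bit integers are aligned on 4 or on 8 bytes.
 */
struct shared_counters {
	uint64_t bytes_in;          /* offset 0  */
	uint64_t bytes_out;         /* offset 8  */
	unsigned int last_change;   /* offset 16 */
	unsigned int last_sess;     /* offset 20 */
};

/* Compile-time guards: the build fails if a platform disagrees */
static_assert(sizeof(unsigned int) == 4, "unsigned int must be 4 bytes");
static_assert(offsetof(struct shared_counters, bytes_out) == 8, "bad offset");
static_assert(offsetof(struct shared_counters, last_change) == 16, "bad offset");
static_assert(sizeof(struct shared_counters) == 24, "layout changed");
```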
2025-10-24 09:35:38 +02:00
Aurelien DARRAGON
a931779dde Revert "MINOR: compiler: add FIXED_SIZE(size, type, name) macro"
This reverts commit 466a603b59ed77e9787398ecf1baf77c46ae57b1.
Due to the last 2 commits, this macro is now unused, and will probably
never be used, so let's get rid of that for now.
2025-10-24 09:35:34 +02:00
Aurelien DARRAGON
8277f891d2 Revert "MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct"
This reverts commit 4693ee0ff7a5fa4a12ff69b1a33adca142e781ac.
As discussed in GH #3168, this works but it is not the proper way to fix
the issue. See following commits.
2025-10-24 09:35:29 +02:00
Aurelien DARRAGON
c0d952ccc1 Revert "BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency"
This reverts commit d30b88a6cc47d662e92b524ad5818be312401d0e.
As discussed in GH #3168, this works but it is not the proper way to fix
the issue. See following commits.
2025-10-24 09:35:25 +02:00
Christopher Faulet
854888497e BUG/MEDIUM: applet: Improve again spinning loops detection with the new API
A first attempt to fix this issue was already pushed (54b7539d6 "BUG/MEDIUM:
apppet: Improve spinning loop detection with the new API"). But it was not
fully accurate. Indeed, we must check if something was received or sent by
the applet before incrementing the call rate. But we must also take care
that the applet is allowed to receive or send data. That is what is
performed in this patch.

This patch must be backported as far as 3.0 with the patch above.
2025-10-24 09:26:10 +02:00
Amaury Denoyelle
7ba4b0ad5f BUG/MINOR: quic: rename and duplicate stream settings
Several settings can be set to control stream multiplexing and
associated receive window. Previously, all of these settings were
configured using prefix "tune.quic.frontend.", despite being applied
blindly on both sides.

Fix this by duplicating these settings for the frontend and backend
sides. Options are also renamed to use the standardized
"tune.quic.[be|fe].stream." prefix notation.

Also, each option is individually renamed to better reflect its purpose
and hide technical details relative to QUIC transport parameter naming :
* max-data-size -> stream.rxbuf
* max-streams-bidi -> stream.max-concurrent
* stream-data-ratio -> stream.data-ratio

No need to backport.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
d5142706f8 BUG/MINOR: quic: split option for congestion max window size
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
33afba0dda BUG/MINOR: quic: split max-idle-timeout option for FE/BE usage
Streamline max-idle-timeout option. Rename it to use the newer cohesive
naming scheme 'tune.quic.fe|be.'.

Two different fields were already defined in global struct. These fields
are moved into quic_tune along with other QUIC settings. However, no
parser was defined for backend option, this commit fixes this.

No need to backport this.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
5bc659a4a2 MINOR: quic: rename frontend sock-per-conn setting
On frontend side, a quic_conn can have a dedicated FD or use the
listener one. These different modes can be activated via a global QUIC
tune setting.

This patch adjusts the option. First, it is renamed to the more
meaningful name 'tune.quic.fe.sock-per-conn'. Also, arguments are now
either 'default-on' or 'force-off'. The objective is to better highlight
the relationship with the 'quic-socket' bind option.

The older option is deprecated and will be removed in 3.5.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
a14c6cee17 MINOR: quic: rename retry-threshold setting
A QUIC global tune setting is defined to be able to force Retry emission
prior to handshake. By definition, this ability is only supported by
QUIC servers, hence it is a frontend option only.

Rename the option to use the "fe" prefix. The old option name is deprecated
and will be removed in 3.5.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
d248c5bd21 MINOR: quic: rename max Tx mem setting
QUIC global memory can be limited across the entire process via a global
tune setting. Previously, this setting used the misleading "frontend"
prefix. As this is applied as a sum between all QUIC connections, both
from frontend and backend sides, remove the prefix. The new option name
is "tune.quic.mem.tx-max".

The older option name is deprecated and will be removed in 3.5.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
9bfe9b9e21 MINOR: quic: split Tx options for FE/BE usage
This patch is similar to the previous one, except that it is focused on
Tx QUIC settings. It is now possible to toggle GSO and pacing on
frontend and backend sides independently.

As with the previous patch, options are renamed to use "fe/be" unified
prefixes. This is part of the current series of commits which unify QUIC
settings. Older options are deprecated and will be removed in the 3.5
release.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
33a8cb87a9 MINOR: quic: split congestion controler options for FE/BE usage
Various settings can be configured related to QUIC congestion controller.
This patch duplicates them to be able to set independent values on
frontend and backend sides.

As with the previous patch, options are renamed to use "fe/be" unified
prefixes. This is part of the current series of commits which unify QUIC
settings. Older options are deprecated and will be removed in the 3.5
release.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
7640e9a9ee MINOR: quic: duplicate glitches FE option on BE side
Previously, QUIC glitches support was only implemented for frontend
side. Extend this so that the option can be specified separately both on
frontend and backend sides. Function _qcc_report_glitch() now retrieves
the relevant max value based on connection side.

In addition to this, the option has been renamed to use "fe/be" prefixes.
This is part of the current series of commits which unify QUIC settings.
Older options are deprecated and will be removed in the 3.5 release.
2025-10-23 16:49:20 +02:00
Amaury Denoyelle
b34cd0b506 MINOR: quic: rename "no-quic" to "tune.quic.listen"
Rename the option to quickly enable/disable all QUIC listeners. It now
takes an on/off argument. The documentation is extended to reflect the
fact that QUIC backends are not impacted by this option.

The older keyword is simply removed. Deprecation is considered
unnecessary as this setting is only useful during debugging.
2025-10-23 16:47:58 +02:00
Amaury Denoyelle
42e5ec6519 MINOR: quic: prepare support for options on FE/BE side
A major reorganization of QUIC settings is going to be performed. One of
its objectives is to clearly define options which can be separately
configured on frontend and backend proxy sides.

To implement this, quic_tune structure is extended to support fe and be
options. A set of macros/functions is also defined : it allows to
retrieve an option defined on both sides with unified code, based on
proxy side of a quic_conn/connection instance.
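
The kind of structure and accessor described above might look like the
sketch below. All names (struct side_tune, struct tune_cfg,
TUNE_SIDE_GET, glitches_max, max_idle_ms) are illustrative assumptions
and do not reflect the actual quic_tune layout or macros.

```c
#include <assert.h>

/* Hypothetical per-side options */
struct side_tune {
	unsigned int glitches_max;
	unsigned int max_idle_ms;
};

/* One copy of the options for each proxy side */
struct tune_cfg {
	struct side_tune fe;   /* frontend (listener) side */
	struct side_tune be;   /* backend (server) side */
};

/* Fetch <field> for the side the connection belongs to, so that callers
 * share the same code on both sides. <is_back> is non-zero for a backend
 * connection.
 */
#define TUNE_SIDE_GET(cfg, is_back, field) \
	((is_back) ? (cfg)->be.field : (cfg)->fe.field)
```

A caller holding a connection would pass a predicate telling on which
side it lives instead of hard-coding "fe" or "be".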
2025-10-23 15:06:01 +02:00
Amaury Denoyelle
cf3cf7bdda MINOR: quic: remove unused conn-tx-buffers limit keyword
Remove parsing code for tune.quic.frontend.conn-tx-buffers.limit. This
option had been deprecated for some time and in fact was a no-op that was
no longer mentioned in the documentation.
2025-10-23 15:06:01 +02:00
Olivier Houchard
f40f5401b9 BUG/MEDIUM: mt_lists: Avoid el->prev = el->next = el
Avoid setting both el->prev and el->next on the same line.
The goal is to set both el->prev and el->next to el, but a naive
compiler, such as when we're using -O0, will set el->next first, then
will set el->prev to the value of el->next, but if we're unlucky,
el->next will have been set to something else by another thread.
So explicitly set both to what we want.

This should be backported up to 2.8.
2025-10-23 14:43:51 +02:00
William Lallemand
d0f9515e5c MINOR: acme: display the complete challenge_ready command in the logs
When using a wildcard DNS domain in the ACME configuration, for example
*.example.com, one might think that it needs to use the challenge_ready
command with this domain. But that's not the case, the challenge_ready
command takes the domain asked by the ACME server, which is stripped of
the wildcard.

In order to be clearer, the log message now shows exactly the command the
user should send.
2025-10-23 11:14:07 +02:00
William Lallemand
861fe53204 MINOR: acme: add the dns-01-record field to the sink
The dns-01-record field in the dpapi sink outputs the authentication
token which is needed in the TXT record in order to validate the DNS-01
challenge.
2025-10-23 11:14:07 +02:00
Olivier Houchard
dfe866fa98 BUG/MEDIUM: stick-tables: Don't loop if there's nothing left
Before waking up the expiration task again at the end of it, make sure
the next date is set. If there's nothing left to do, then task_exp will
be TASK_ETERNITY and we then don't want to be woken up again.
2025-10-23 10:51:52 +02:00
Willy Tarreau
871c80505c BUG/MEDIUM: build: limit excessive and counter-productive gcc-15 vectorization
In https://bugs.gentoo.org/964719, Dan Goodliffe reported that using
CFLAGS="-O3 -march=westmere" creates a binary that segfaults on startup
with gcc-15. This could be reproduced here, is isolated to gcc-15 and
-O3, and is caused by gcc emitting "movdqa" instructions to read unaligned
longs taken from chars that were carefully isolated within ifdefs checking
for support for unaligned integers on the platform...

Some experiments showed that changing all casts all over the code using
either typedef-enforced align(1) or using the packed union trick does
the job, but it needs a more in-depth validation since it's obvious that
it doesn't produce the same code at all (at least on more modern
machines).
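
For reference, the "packed union trick" mentioned above can be sketched
as follows, next to a memcpy()-based alternative. The names are invented
for the example; the packed and may_alias attributes are GCC/Clang
extensions, and this is not the code that was evaluated in haproxy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The packed attribute tells the compiler that the object may sit at any
 * address, so it must not assume natural alignment when choosing load
 * instructions. may_alias lets it overlay arbitrary bytes.
 */
union unaligned_u64 {
	uint64_t v;
} __attribute__((packed, may_alias));

static inline uint64_t read_u64_packed(const void *p)
{
	assert(p != NULL);
	return ((const union unaligned_u64 *)p)->v;
}

/* Portable alternative: a fixed-size memcpy() is usually compiled to a
 * single unaligned load on platforms that support it.
 */
static inline uint64_t read_u64_memcpy(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```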

However, the offending optimization option could be isolated, it's
"-fvect-cost-model=dynamic" which causes this, while -O2 uses
"-fvect-cost-model=very-cheap". Turning it back to very-cheap solves the
issue, reduces the code, and yields an extra 5% performance increase on
the http-request rate (181k vs 172k on a single core)! This could at
least partially explain why it has been observed several times over
the last few years that -O3 yields bigger and slower code than -O2.

It was also verified that the option doesn't change the emitted code
at -O0..-O2,-Os,-Oz, but only at -O3.

This patch detects support for this option and sets it back to very-cheap
to address the problem that some distros are facing after an upgrade to
gcc-15. As such it should be backported to recent LTS and stable
branches. Here, 3.1 was used, so it seems legit to at least target
the last two LTS branches (i.e. go as far as 3.0).

Thanks to Dan Goodliffe for sharing a working reproducer, Sam James
for starting the investigations and Christian Ruppert for bringing
the issue to us.
2025-10-23 10:06:52 +02:00
Aurelien DARRAGON
d30b88a6cc BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency
As reported by @tianon on GH #3168, running haproxy on 32bits i386
platform would trigger the following BUG_ON() statement:

 FATAL: bug condition "sizeof(struct shm_stats_file_object) != 544" matched at src/stats-file.c:825
shm_stats_file_object struct size changed, is is part of the exported API: ensure all precautions were taken (ie: shm_stats_file version change) before adjusting this

In fact, some efforts were already taken to ensure shm_stats_file_object
struct size remains consistent on 64 vs 32 bits platforms, since
shm_stats_file_object is part of the public API and directly exposed in
the stats file.

However, some parts were overlooked: some structs that are embedded in
shm_stats_file_object struct itself weren't using fixed-width integers,
and would sometimes be unaligned. The result of this is that it was
up to the compiler (platform-dependent) to choose how to deal with such
ambiguities, which could cause the struct mapping/size to be inconsistent
from one platform to another.

Fortunately this was caught by the BUG_ON() statement and with the precious
help of @tianon.

To fix this, we now use fixed-width integers everywhere for members
(and submembers) of shm_stats_file_object struct, and we use explicit
padding where missing to avoid automatic padding when we don't expect
one. As in the previous commit, we leverage the FIXED_SIZE() and
FIXED_SIZE_ARRAY() macros to set the expected width for each integer
without causing build issues on platforms that don't support larger
integers.

No backport needed, this feature was introduced during 3.3-dev.
2025-10-22 20:52:22 +02:00
Aurelien DARRAGON
4693ee0ff7 MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct
freq-ctr struct is used by the shm_stats_file API, and more precisely,
it is used in the shm_stats_file_object struct for counters.

shm_stats_file_object struct must be platform-independent, thus
we switch to using explicit size types (AKA fixed width integer types)
for freq-ctr, in an attempt to make freq-ctr size and memory mapping
consistent from one platform to another.

We cannot simply use fixed-width integer because some of them are
involved in atomic operations, and forcing a given width could
cause build issues on some platforms where atomic ops are not
implemented for large integers. Instead we leverage the FIXED_SIZE
macro to keep handling the integers as before, but forcing them to
be stored using expected number of bytes (unused bytes will simply
be ignored).

No change of behavior should be expected.
2025-10-22 20:52:18 +02:00
Aurelien DARRAGON
466a603b59 MINOR: compiler: add FIXED_SIZE(size, type, name) macro
FIXED_SIZE() macro can be used to instruct the compiler that the struct
member named <name>, handled as <type>, must be stored using <size> bytes,
even if the type used is actually smaller than the expected size.

FIXED_SIZE_ARRAY() is similar to FIXED_SIZE() but for arrays: it takes an
extra argument which is the number of members.

They may be used for portability concerns to ensure a structure mapping
remains consistent between platforms.
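
One possible way to obtain this behavior is an anonymous union, as
sketched below. The macro body is a guess written for illustration and
is not taken from the patch; it assumes sizeof(type) <= size and a
compiler supporting C11 anonymous unions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical definition: the union reserves <size> bytes while the
 * member is still accessed as <type> under its own name. Bytes beyond
 * sizeof(type) are simply left unused.
 */
#define FIXED_SIZE(size, type, name) \
	union { type name; char name##_storage[size]; }

struct counters {
	FIXED_SIZE(8, unsigned long, hits);    /* 8 bytes even if long is 4 */
	FIXED_SIZE(8, unsigned int, errors);   /* 8 bytes, used as an int */
};

/* Same layout whatever the native width of long */
static_assert(sizeof(struct counters) == 16, "unexpected layout");
static_assert(offsetof(struct counters, errors) == 8, "unexpected offset");
```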
2025-10-22 20:52:12 +02:00
Aurelien DARRAGON
1e4dbebef2 MINOR: stats-file: fix typo in shm-stats-file object struct size detection
As reported by @TimWolla on GH #3168, there was a typo in shm stats file
BUG_ON to report that the size of shm_stats_file_object changed.

No backport needed.
2025-10-22 20:52:08 +02:00
Amaury Denoyelle
f50425c021 MINOR: quic: remove received CRYPTO temporary tree storage
The previous commit switched from ncbuf to ncbmbuf as storage for received
CRYPTO frames. The latter ensures that buffering of such frames cannot
fail anymore due to gap sizes.

Previously, extra mechanisms were implemented in the QUIC frames parsing
function to overcome the limitation of ncbuf on gap sizes. Before
insertion, CRYPTO frames were stored in a temporary tree to order their
insertion. As this is not necessary anymore, this commit removes the
temporary tree insertion.

This commit is closely associated with the previous bug fix. As it
provides a neat optimization and code simplification, it can be backported
with it, but not in the next immediate release, in order to spot potential
regressions.
2025-10-22 15:24:02 +02:00
Amaury Denoyelle
4c11206395 BUG/MAJOR: quic: use ncbmbuf for CRYPTO handling
In QUIC, TLS handshake messages such as ClientHello are encapsulated in
CRYPTO frames. Each QUIC implementation can split the content in several
frames of random sizes. In fact, this feature is now used by several
clients, based on Chrome's so-called "Chaos protection" mechanism :

https://quiche.googlesource.com/quiche/+/cb6b51054274cb2c939264faf34a1776e0a5bab7

To support this, haproxy uses a ncbuf storage to store received CRYPTO
frames before passing it to the SSL library. However, this storage
suffers from a limitation as gaps between two filled blocks cannot be
smaller than 8 bytes. Thus, depending on the size of received CRYPTO
frames and their order, ncbuf may not be sufficient. Over time, several
mechanisms were implemented in haproxy QUIC frames parsing to overcome
the ncbuf limitation.

However, recent reports highlight that with some clients haproxy is
not able to deal with CRYPTO frames reception. In particular, this is
the case with the latest ngtcp2 release, which implements a similar
chaos protection mechanism via the following patch. It also seems that
this impacts haproxy interaction with firefox.

commit 89c29fd8611d5e6d2f6b1f475c5e3494c376028c
Author: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
Date:   Mon Aug 4 22:48:06 2025 +0900

    Crumble Client Initial CRYPTO (aka chaos protection)

To fix haproxy CRYPTO frames buffering once and for all, an alternative
non-contiguous buffer named ncbmbuf has been recently implemented. This
type does not suffer from gaps size limitation, albeit at the cost of a
small reduction in the size available for data storage.

Thus, the purpose of this current patch is to replace ncbuf with the
newer ncbmbuf for QUIC CRYPTO frames parsing. Now, ncbmb_add() is used
to buffer received frames, which is guaranteed to succeed. The only
remaining case of error is if a received frame offset and length exceed
the ncbmbuf data storage, which would result in a CRYPTO_BUFFER_EXCEEDED
error code.

A notable behavior change when switching to ncbmbuf implementation is
that NCB_ADD_COMPARE mode cannot be used anymore during add. Instead,
crypto frame content received at a similar offset will be overwritten.

A final note regarding STREAM frames parsing. For now, it is considered
unnecessary to switch from ncbuf in this case. Indeed, QUIC clients do
not perform aggressive fragmentation for them. Keeping ncbuf ensures that
the data storage size is bigger than the equivalent ncbmbuf area.

This should fix github issue #3141.

This patch must be backported up to 2.6. It is first necessary to pick
the relevant commits for ncbmbuf implementation prior to it.
2025-10-22 15:04:41 +02:00
Amaury Denoyelle
25e378fa65 MINOR: ncbmbuf: add tests as standalone mode
Write some tests for ncbmbuf. These tests should be run each time
ncbmbuf implementation is adjusted. Use the following command :

$ gcc -g -DSTANDALONE -I./include -o ncbmbuf src/ncbmbuf.c && ./ncbmbuf

As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
2025-10-22 15:04:24 +02:00
Amaury Denoyelle
8b8ab2824e MINOR: ncbmbuf: implement advance operation
Implement ncbmb_advance() function for the ncbmbuf type. This allows
removing bytes at the front of the buffer, regardless of the existing gaps.
This is implemented by resetting the corresponding bits of the bitmap.

As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
2025-10-22 15:04:06 +02:00
Amaury Denoyelle
42c495f3d7 MINOR: ncbmbuf: implement ncbmb_data()
Implement ncbmb_data() function for the ncbmbuf type. Its purpose is
similar to its ncbuf counterpart : it returns the size in bytes of data
starting at a specific offset until the next gap.

As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
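
The semantics described above can be sketched on a plain bitmap. The
function below is hypothetical and is not the ncbmb_data()
implementation: it assumes one bitmap bit per data byte, set when the
byte is filled, and ignores any wrapping of the buffer head.

```c
#include <assert.h>
#include <stddef.h>

/* Return the number of contiguous filled bytes starting at offset <off>,
 * i.e. up to the next gap or the end of the buffer. <bitmap> carries one
 * bit per data byte for a buffer of <size> bytes.
 */
size_t bitmap_data(const unsigned char *bitmap, size_t size, size_t off)
{
	size_t n = 0;

	assert(bitmap != NULL);
	while (off + n < size &&
	       ((bitmap[(off + n) / 8] >> ((off + n) % 8)) & 1))
		n++;
	return n;
}
```

A production version would rather scan the bitmap one byte at a time
with a mask; the bit-by-bit loop is kept here for readability.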
2025-10-22 15:04:06 +02:00
Amaury Denoyelle
db4a68752d MINOR: ncbmbuf: implement iterator bitmap utilities functions
Extend private API for ncbmbuf type by defining an iterator type for the
buffer bitmap handling. The purpose is to provide a simple method to
iterate over the bitmap one byte at a time, with a proper bitmask set to
hide irrelevant bits.

This internal type is unused for now, but will become useful when
implementing ncbmb_data() and ncbmb_advance() functions.

As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
2025-10-22 15:04:06 +02:00
Amaury Denoyelle
1e1a3aa6aa MINOR: ncbmbuf: implement add
This patch implements add operation for ncbmbuf type.

This function is simpler than its ncbuf counterpart. Indeed, for now
only NCB_ADD_OVERWRT mode is supported. This compromise has been chosen
as ncbmbuf will be first used for QUIC CRYPTO frames handling, which
does not mandate to compare existing filled blocks during insertion.

As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
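
An overwrite-only insertion tracked by a bitmap can be sketched as
below. struct bmbuf and bmbuf_add() are hypothetical names, and the
layout (separate data area and bitmap, one bit per byte) is an
assumption made for the example rather than the actual ncbmbuf layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* <area> holds <size> data bytes and <bitmap> holds one bit per data
 * byte (bit set = byte filled).
 */
struct bmbuf {
	unsigned char *area;
	unsigned char *bitmap;   /* at least (size + 7) / 8 bytes */
	size_t size;
};

/* Copy <len> bytes of <data> at offset <off>, overwriting anything
 * already stored there, and mark these bytes as filled. Since each byte
 * has its own bit, a gap may be as small as a single byte. Returns 0 on
 * success or -1 if the block does not fit in the buffer.
 */
int bmbuf_add(struct bmbuf *b, size_t off, const void *data, size_t len)
{
	size_t i;

	assert(b != NULL && data != NULL);
	if (off > b->size || len > b->size - off)
		return -1;

	memcpy(b->area + off, data, len);
	for (i = off; i < off + len; i++)
		b->bitmap[i / 8] |= (unsigned char)(1u << (i % 8));
	return 0;
}
```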
2025-10-22 15:04:06 +02:00
Amaury Denoyelle
b9f91ad3ff MINOR: ncbmbuf: define new ncbmbuf type
Define ncbmbuf which is an alternative non-contiguous buffer
implementation. "bm" abbreviation stands for bitmap, which reflects how
gaps and filled blocks are encoded. The main purpose of this
implementation is to get rid of the ncbuf limitation regarding the
minimal size for gaps between two blocks of data.

This commit adds the new module ncbmbuf. Along with it, some utility
functions such as ncbmb_make(), ncbmb_init() and ncbmb_is_empty() are
defined. Public API of ncbmbuf will be extended in the following
patches.

This patch is not considered a bug fix. However, it will be required to
fix issue encountered on QUIC CRYPTO frames parsing. Thus, it will be
necessary to backport the current patch prior to the fix to come.
2025-10-22 15:04:06 +02:00