Commit Graph

24126 Commits

Author SHA1 Message Date
Willy Tarreau
05a4efb102 MINOR: thread: rely on the cpuset functions to count bound CPUs
let's just clean up the thread_cpus_enabled() code a little bit
by removing the OS-specific code and rely on ha_cpuset_detect_bound()
instead. On macos we continue to use sysconf() for now.
2025-03-14 18:30:30 +01:00
Willy Tarreau
32bb68e736 MINOR: cpuset: make the API support negative CPU IDs
Negative IDs are very convenient to mean "not set", so let's just make
the cpuset API robust against this, especially with ha_cpuset_isset()
so that we don't have to manually add this check everywhere when a
value is not known.
2025-03-14 18:30:30 +01:00
Willy Tarreau
f156baf8ce DOC: design-thoughts: commit numa-auto.txt
Lots of collected data and observations aggregated into a single commit
so as not to lose them. Some parts below come from several commit
messages and are incremental.

Add captures and analysis of intel 14900 where it's not easy to draw
the line between the desired P and E cores.

The 14900 raises some questions (imagine a dual-die variant in multi-socket).
That's the start of an algorithmic distribution of performance cores into
thread groups.

cpu-map currently conflicts a lot with the choices after auto-detection
but it doesn't have to. The problem is the inability to configure the
threads for the whole process like taskset does. By offering this ability
we can also start to designate groups of CPUs symbolically (package, die,
ccx, cores, smt).

It can also be useful to exploit the info from cpuinfo that is not
available in /sys, such as the model number. At least on arm, higher
numbers indicate bigger cores and can be useful to distinguish cores
inside a cluster. It will not indicate big vs medium ones of the same
type (e.g. a78 3.0 vs 2.4 GHz) but can still be effective at identifying
the efficient ones.

In short, infos such as cluster ID not always reliable, and are
local to the package. die_id as well. die number is not reported
here but should definitely be used, as a higher priority than L3.

We're still missing a discriminant between the l3 and cluster number
in order to address heterogenous CPUs (e.g. intel 14900), though in
terms of locality that's currently done correctly.

CPU selection is also a full topic, and some thoughts were noted
regarding sorting by perf vs locality so as never to mix inter-
socket CPUs due to sorting.

The proposed cpu-selection cannot work as-is, because it acts both on
restriction and preference, and these two are not actions but a sequence.
First restrictions must be enforced, and second the remaining CPUs are
sorted according to the preferred criterion, and a number of threads are
selected.

Currently we refine the OS-exposed cluster number but it's not correct
as we can end up with something poorly numbered. We need to respect the
LLC in any case so let's explain the approach.
2025-03-14 18:30:30 +01:00
Willy Tarreau
0ceb1f2c51 DEV: ncpu: also emulate sysconf() for _SC_NPROCESSORS_*
This is also needed in order to make the requested number of CPUs
appear. For now we don't reroute to the original sysconf() call so
we return -1,EINVAL for all other info.
2025-03-14 18:30:30 +01:00
Willy Tarreau
ed75148ca0 BUILD: tools: avoid a build warning on gcc-4.8 in resolve_sym_name()
A build warning is emitted with gcc-4.8 in tools.c since commit
e920d73f59 ("MINOR: tools: improve symbol resolution without dl_addr")
because the compiler doesn't see that <size> is necessarily initialized.
Let's just preset it.
2025-03-14 18:30:30 +01:00
Willy Tarreau
4e09789644 MINOR: tools: teach resolve_sym_name() a few more common symbols
This adds run_poll_loop, run_tasks_from_lists, process_runnable_tasks,
ha_dump_backtrace and cli_io_handler which are fairly common in
backtraces. This will be less relative symbols when dladdr is not
usable.
2025-03-13 17:31:16 +01:00
Willy Tarreau
a3582a77f7 MINOR: tools: ease the declaration of known symbols in resolve_sym_name()
Let's have a macro that declares both the symbol and its name, it will
avoid the risk of introducing typos, and encourages adding more when
needed. The macro also takes an optional second argument to permit an
inline declaration of an extern symbol.
2025-03-13 17:30:48 +01:00
Willy Tarreau
e920d73f59 MINOR: tools: improve symbol resolution without dl_addr
When dl_addr is not usable or fails, better fall back to the closest
symbol among the known ones instead of providing everything relative
to main. Most often, the location of the function will give some hints
about what it can be. Thus now we can emit fct+0xXXX in addition to
main+0xXXX or main-0xXXX. We keep a margin of +256kB maximum after a
function for a match, which is around the maximum size met in an object
file, otherwise it becomes pointless again.
2025-03-13 17:30:48 +01:00
Willy Tarreau
1e99efccef MINOR: cli: export cli_io_handler() to ease symbol resolution
It's common to meet this function in backtraces, it's a bit annoying
that it's not resolved, so let's export it so that it becomes resolvable.
2025-03-13 17:30:48 +01:00
Aurelien DARRAGON
8311be5ac6 BUG/MINOR: stats: fix capabilities and hide settings for some generic metrics
Performing a diff on stats output before vs after commit 66152526
("MEDIUM: stats: convert counters to new column definition") revealed
that some metrics were not properly ported to to the new API. Namely,
"lbtot", "cli_abrt" and "srv_abrt" are now exposed on frontend and
listeners while it was not the case before.

Also, "hrsp_other" is exposed even when "mode http" wasn't set on the
proxy.

In this patch we restore original behavior by fixing the capabilities
and hide settings.

As this could be considered as a minor regression (looking at the commit
message it doesn't seem intended), better tag this as a bug. It should be
backported in 3.0 with 66152526.
2025-03-13 11:49:18 +01:00
Aurelien DARRAGON
4c3eb60e70 DOC: management: rename some last occurences from domain "dns" to "resolvers"
This is a complementary patch to cf913c2f9 ("DOC: management: rename show
stats domain cli "dns" to "resolvers"). The doc still refered to the
legacy "dns" domain filter for stat command. Let's rename those occurences
to "resolvers".

It may be backported to all stable versions.
2025-03-13 11:49:10 +01:00
Willy Tarreau
78ef52dbd1 BUILD: backend: silence a build warning when threads are disabled
Since commit 8de8ed4f48 ("MEDIUM: connections: Allow taking over
connections from other tgroups.") we got this partially absurd
build warning when disabling threads:

  src/backend.c: In function 'conn_backend_get':
  src/backend.c:1371:27: warning: array subscript [0, 0] is outside array bounds of 'struct tgroup_info[1]' [-Warray-bounds]

The reason is that gcc sees that curtgid is not equal to tgid which is
defined as 1 in this case, thus it figures that tgroup_info[curtgid-1]
will be anything but zero and that doesn't fit. It is ridiculous as it
is a perfect case of dead code elimination which should not warrant a
warning. Nevertheless we know we don't need to do this when threads are
disabled and in this case there will not be more than 1 thread group, so
we can happily use that preliminary test to help the compiler eliminate
the dead condition and avoid spitting this warning.

No backport is needed.
2025-03-12 18:16:14 +01:00
Willy Tarreau
b61ed9babe BUILD: tools: silence a build warning when USE_THREAD=0
The dladdr_lock that was added to avoid re-entering into dladdr is
conditioned by threads, but the way it's declared causes a build
warning if threads are disabled due to the insertion of a lone semi
colon in the variables block. Let's switch to __decl_thread_var()
for this.

This can be backported wherever commit eb41d768f9 ("MINOR: tools:
use only opportunistic symbols resolution") is backported. It relies
on these previous two commits:

   bb4addabb7 ("MINOR: compiler: add a simple macro to concatenate resolved strings")
   69ac4cd315 ("MINOR: compiler: add a new __decl_thread_var() macro to declare local variables")
2025-03-12 18:11:14 +01:00
Willy Tarreau
69ac4cd315 MINOR: compiler: add a new __decl_thread_var() macro to declare local variables
__decl_thread() already exists but is more suited for struct members.
When using it in a variables block, it appends the final trailing
semi-colon which is a statement that ends the variable block. Better
clean this up and have one precisely for variable blocks. In this
case we can simply define an unused enum value that will consume the
semi-colon. That's what the new macro __decl_thread_var() does.
2025-03-12 18:08:12 +01:00
Willy Tarreau
bb4addabb7 MINOR: compiler: add a simple macro to concatenate resolved strings
It's often useful to be able to concatenate strings after resolving
them (e.g. __FILE__, __LINE__ etc). Let's just have a CONCAT() macro
to do that, which calls _CONCAT() with the same arguments to make
sure the contents are resolved before being concatenated.
2025-03-12 18:06:55 +01:00
Willy Tarreau
12383fd9f5 BUG/MEDIUM: thread: use pthread_self() not ha_pthread[tid] in set_affinity
A bug was uncovered by the work on NUMA. It only triggers in the CI
with libmusl due to a race condition. What happens is that the call
to set_thread_cpu_affinity() is done very early in the polling loop,
and that it relies on ha_pthread[tid] instead of pthread_self(). The
problem is that ha_pthread[tid] is only set by the return from
pthread_create(), which might happen later depending on the number of
CPUs available to run the starting thread.

Let's just use pthread_self() here. ha_pthread[] is only used to send
signals between threads, there's no point in using it here.

This can be backported to 2.6.
2025-03-12 15:59:23 +01:00
Aurelien DARRAGON
e942305214 MEDIUM: log: change default "host" strategy for log-forward section
Historically, log-forward proxy used to preserve host field from input
message as much as possible, and if syslog host wasn't provided
(rfc5424 '-' or bad rfc3164 or rfc5424 message) then "localhost" or "-"
would be used as host when outputting message using rfc3164 or rfc5424.

We change that behavior (which corresponds to "keep" host option), so that
log-forward now uses "fill" strategy as default: if the host is provided
in input message, it is preserved. However if it is missing and IP address
from sender is available, we use it.
2025-03-12 10:55:49 +01:00
Aurelien DARRAGON
ad0133cc50 MINOR: log: handle log-forward "option host"
Following previous patch, we know implement the logic for the host
option under log-forward section. Possible strategies are:

      replace If input message already contains a value for the host
              field, we replace it by the source IP address from the
              sender.
              If input message doesn't contain a value for the host field
              (ie: '-' as input rfc5424 message or non compliant rfc3164
              or rfc5424 message), we use the source IP address from the
              sender as host field.

      fill    If input message already contains a value for the host field,
              we keep it.
              If input message doesn't contain a value for the host field
              (ie: '-' as input rfc5424 message or non compliant rfc3164
              or rfc5424 message), we use the source IP address from the
              sender as host field.

      keep    If input message already contains a value for the host field,
              we keep it.
              If input message doesn't contain a value for the host field,
              we set it to localhost (rfc3164) or '-' (rfc5424).
              (This is the default)

      append  If input message already contains a value for the host field,
              we append a comma followed by the IP address from the sender.
              If input message doesn't contain a value for the host field,
              we use the source IP address from the sender.

Default value (unchanged) is "keep" strategy. option host is only relevant
with rfc3164 or rfc5424 format on log targets. Also, if the source address
is not available (ie: UNIX socket), default behavior prevails.

Documentation was updated.
2025-03-12 10:52:07 +01:00
Aurelien DARRAGON
003fe530ae MINOR: log: add "option host" log-forward option
add only the parsing part, options are currently unused
2025-03-12 10:51:35 +01:00
Aurelien DARRAGON
47f14be9f3 MINOR: tools: only print address in sa2str() when port == -1
Support special value for port in sa2str: if port is equal to -1, only
print the address without the port, also ignoring <map_ports> value.
2025-03-12 10:51:20 +01:00
Aurelien DARRAGON
2de62d0461 MINOR: log: provide source address information in syslog_process_message()
provide struct sockaddr_storage pointer from the message sender in
syslog_process_message()
2025-03-12 10:50:30 +01:00
Aurelien DARRAGON
bc76f6dde9 MINOR: log: migrate log-forward options from proxy->options2 to options3
Migrate recently added log-forward section options, currently stored under
proxy->options2 to proxy->options3 since proxy->options2 is running out of
space and we plan on adding more log-forward options.
2025-03-12 10:50:03 +01:00
Aurelien DARRAGON
cc5a66212d MINOR: proxy: add proxy->options3
proxy->options2 is almost full, yet we will add new log-forward options
in upcoming patches so we anticipate that by adding a new {no_}options3
and cfg_opts3[] to further extend proxy options
2025-03-12 10:49:36 +01:00
Aurelien DARRAGON
d47e7103b8 CLEANUP: log: add syslog_process_message() helper
Prevent code duplication under syslog_fd_handler() and syslog_io_handler()
by merging common code path in a single syslog_process_message() helper
that processed a single message stored in <buf> according to <frontend>
settings.
2025-03-12 10:49:18 +01:00
Aurelien DARRAGON
8b8520305e CLEANUP: log-forward: remove useless options2 init
It is actually not required to zero out proxy->options2 since proxy is
allocated using calloc() which already does it.
2025-03-12 10:49:08 +01:00
William Lallemand
c6e6318125 CI: github: add "jose" to apt dependencies
jose is used in the JWS unit-test, let's add it to the CI.
2025-03-11 22:29:40 +01:00
William Lallemand
d014d7ee72 TESTS: jws: implement a test for JWS signing
This test returns a JWS payload signed a specified private key in the
PEM format, and uses the "jose" command tool to check if the signature
is correct against the jwk public key.

The test could be improved later by using the code from jwt.c allowing
to check a signature.
2025-03-11 22:29:40 +01:00
William Lallemand
3abb428fc8 MINOR: jws: implement JWS signing
This commits implement JWS signing, this is divided in 3 parts:

- jws_b64_protected() creates a JWS "protected" header, which takes the
  algorithm, kid or jwk, nonce and url as input, and fill a destination
  buffer with the base64url version of the header
- jws_b64_payload() just encode a payload in base64url
- jws_b64_signature() generates a signature using as input the protected
  header and the payload, it supports ES256, ES384 and ES512 for ECDSA
  keys, and RS256 for RSA ones. The RSA signature just use the
  EVP_DigestSign() API with its result encoded in base64url. For ECDSA
  it's a little bit more complicated, and should follow section 3.4 of
  RFC7518, R and S should be padded to byte size.

Then the JWS can be output with jws_flattened() which just formats the 3
base64url output in a JSON representation with the 3 fields, protected,
payload and signature.
2025-03-11 22:29:40 +01:00
Willy Tarreau
3cbeb6a74b [RELEASE] Released version 3.2-dev7
Released version 3.2-dev7 with the following main changes :
    - BUG/MEDIUM: applet: Don't handle EOI/EOS/ERROR is applet is waiting for room
    - BUG/MEDIUM: spoe/mux-spop: Introduce an NOOP action to deal with empty ACK
    - BUG/MINOR: cfgparse: fix NULL ptr dereference in cfg_parse_peers
    - BUG/MEDIUM: uxst: fix outgoing abns address family in connect()
    - REGTESTS: fix reg-tests/server/abnsz.vtc
    - BUG/MINOR: log: fix outgoing abns address family
    - BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers
    - MINOR: clock: always use atomic ops for global_now_ms
    - CI: QUIC Interop: clean old docker images
    - BUG/MINOR: stream: do not call co_data() from __strm_dump_to_buffer()
    - BUG/MINOR: mux-h1: always make sure h1s->sd exists in h1_dump_h1s_info()
    - MINOR: tinfo: add a new thread flag to indicate a call from a sig handler
    - BUG/MEDIUM: stream: never allocate connection addresses from signal handler
    - MINOR: freq_ctr: provide non-blocking read functions
    - BUG/MEDIUM: stream: use non-blocking freq_ctr calls from the stream dumper
    - MINOR: tools: use only opportunistic symbols resolution
    - CLEANUP: task: move the barrier after clearing th_ctx->current
    - MINOR: compression: Introduce minimum size
    - BUG/MINOR: h2: always trim leading and trailing LWS in header values
    - MINOR: tinfo: split the signal handler report flags into 3
    - BUG/MEDIUM: stream: don't use localtime in dumps from a signal handler
    - OPTIM: connection: don't try to kill other threads' connection when !shared
    - BUILD: add possibility to use different QuicTLS variants
    - MEDIUM: fd: Wait if locked in fd_grab_tgid() and fd_take_tgid().
    - MINOR: fd: Add fd_lock_tgid_cur().
    - MEDIUM: epoll: Make sure we can add a new event
    - MINOR: pollers: Add a fixup_tgid_takeover() method.
    - MEDIUM: pollers: Drop fd events after a takeover to another tgid.
    - MEDIUM: connections: Allow taking over connections from other tgroups.
    - MEDIUM: servers: Add strict-maxconn.
    - BUG/MEDIUM: server: properly initialize PROXY v2 TLVs
    - BUG/MINOR: server: fix the "server-template" prefix memory leak
    - BUG/MINOR: h3: do not report transfer as aborted on preemptive response
    - CLEANUP: h3: fix documentation of h3_rcv_buf()
    - MINOR: hq-interop: properly handle incomplete request
    - BUG/MEDIUM: mux-fcgi: Try to fully fill demux buffer on receive if not empty
    - MINOR: h1: permit to relax the websocket checks for missing mandatory headers
    - BUG/MINOR: hq-interop: fix leak in case of rcv_buf early return
    - BUG/MINOR: server: check for either proxy-protocol v1 or v2 to send hedaer
    - MINOR: jws: implement a JWK public key converter
    - DEBUG: init: add a way to register functions for unit tests
    - TESTS: add a unit test runner in the Makefile
    - TESTS: jws: register a unittest for jwk
    - CI: github: run make unit-tests on the CI
    - TESTS: add config smoke checks in the unit tests
    - MINOR: jws: conversion to NIST curves name
    - CI: github: remove smoke tests from vtest.yml
    - TESTS: ist: fix wrong array size
    - TESTS: ist: use the exit code to return a verdict
    - TESTS: ist: add a ist.sh to launch in make unit-tests
    - CI: github: fix h2spec.config proxy names
    - DEBUG: init: Add a macro to register unit tests
    - MINOR: sample: allow custom date format in error-log-format
    - CLEANUP: log: removing "log-balance" references
    - BUG/MINOR: log: set proper smp size for balance log-hash
    - MINOR: log: use __send_log() with exact payload length
    - MEDIUM: log: postpone the decision to send or not log with empty messages
    - MINOR: proxy: make pr_mode enum bitfield compatible
    - MINOR: cfgparse-listen: add and use cfg_parse_listen_match_option() helper
    - MINOR: log: add options eval for log-forward
    - MINOR: log: detach prepare from parse message
    - MINOR: log: add dont-parse-log and assume-rfc6587-ntf options
    - BUG/MEIDUM: startup: return to initial cwd only after check_config_validity()
    - TESTS: change the output of run-unittests.sh
    - TESTS: unit-tests: store sh -x in a result file
    - CI: github: show results of the Unit tests
    - BUG/MINOR: cfgparse/peers: fix inconsistent check for missing peer server
    - BUG/MINOR: cfgparse/peers: properly handle ignored local peer case
    - BUG/MINOR: server: dont return immediately from parse_server() when skipping checks
    - MINOR: cfgparse/peers: provide more info when ignoring invalid "peer" or "server" lines
    - BUG/MINOR: stream: fix age calculation in "show sess" output
    - MINOR: stream/cli: rework "show sess" to better consider optional arguments
    - MINOR: stream/cli: make "show sess" support filtering on front/back/server
    - TESTS: quic: create first quic unittest
    - MINOR: h3/hq-interop: restore function for standalone FIN receive
    - MINOR/OPTIM: mux-quic: do not allocate rxbuf on standalone FIN
    - MINOR: mux-quic: refine reception of standalone STREAM FIN
    - MINOR: mux-quic: define globally stream rxbuf size
    - MINOR: mux-quic: define rxbuf wrapper
    - MINOR: mux-quic: store QCS Rx buf in a single-entry tree
    - MINOR: mux-quic: adjust Rx data consumption API
    - MINOR: mux-quic: adapt return value of qcc_decode_qcs()
    - MAJOR: mux-quic: support multiple QCS RX buffers
    - MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf
    - MAJOR: mux-quic: increase stream flow-control for multi-buffer alloc
    - BUG/MINOR: cfgparse-tcp: relax namespace bind check
    - MINOR: startup: adjust alert messages, when capabilities are missed
2025-03-07 16:37:57 +01:00
Valentine Krasnobaeva
7d427134fe MINOR: startup: adjust alert messages, when capabilities are missed
CAP_SYS_ADMIN support was added, in order to access sockets in namespaces. So
let's adjust the alert at startup, where we check preserved capabilities from
global.last_checks. Let's mention here cap_sys_admin as well.
2025-03-07 16:37:16 +01:00
Damien Claisse
f0a07f834c BUG/MINOR: cfgparse-tcp: relax namespace bind check
Commit 5cbb278 introduced cap_sys_admin support, and enforced checks for
both binds and servers. However, when binding into a namespace, the bind
is done before dropping privileges. Hence, checking that we have
cap_sys_admin capability set in this case is not needed (and it would
decrease security to add it).
For users starting haproxy with other user than root and without
cap_sys_admin, bind should have already failed.
As a consequence, relax runtime check for binds into a namespace.
2025-03-07 16:23:29 +01:00
Amaury Denoyelle
dc7913d814 MAJOR: mux-quic: increase stream flow-control for multi-buffer alloc
Support for multiple Rx buffers per QCS instance has been introduced by
previous patches. However, due to flow-control initial values, client
were still unable to fully used this to increase their upload
throughput.

This patch increases max-stream-data-bidi-remote flow-control initial
values. A new define QMUX_STREAM_RX_BUF_FACTOR will fix the number of
concurrent buffers allocable per QCS. It is set to 90.

Note that connection flow-control initial value did not changed. It is
still configured to be equivalent to bufsize multiplied by the maximum
concurrent streams. This ensures that Rx buffers allocation is still
constrained per connection, so that it won't be possible to have all
active QCS instances using in parallel their maximum Rx buffers count.
2025-03-07 12:06:27 +01:00
Amaury Denoyelle
75027692a3 MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf
Previous commit introduces support for multiple Rx buffers per QCS
instance. Contiguous data may be splitted accross multiple buffers
depending on their offset.

A particular issue could arise with this new model. Indeed, app_ops
rcv_buf callback can still deal with a single buffer at a time. This may
cause a deadlock in decoding if app_ops layer cannot proceed due to
partial data, but such data are precisely divided on two buffers. This
can for example intervene during HTTP/3 frame header parsing.

To deal with this, a new function is implemented to force data realign
between two contiguous buffers. This is called only when app_ops rcv_buf
returned 0 but data is available in the next buffer after the current
one. In this case, data are transferred from the next into the current
buffer via qcs_transfer_rx_data(). Decoding is then restarted, which
should ensure that app_ops layer has enough data to advance.

During this operation, special care is ensure to removed both
qc_stream_rxbuf entries, as their offset are adjusted. The next buffer
is only reinserted if there is remaining data in it, else it can be
freed.

This case is not easily reproducible as it depends on the HTTP/3 framing
used by the client. It seems to be easily reproduced though with quiche.
$ quiche-client --http-version HTTP/3 --method POST --body /tmp/100m \
  "https://127.0.0.1:20443/post"
2025-03-07 12:06:27 +01:00
Amaury Denoyelle
60f64449fb MAJOR: mux-quic: support multiple QCS RX buffers
Implement support for multiple Rx buffers per QCS instances. This
requires several changes mostly in qcc_recv() / qcc_decode_qcs() which
deal with STREAM frames reception and decoding. These multiple buffers
can be stored in QCS rx.bufs tree which was introduced in an earlier
patch.

On STREAM frame reception, a buffer is retrieved from QCS bufs tree, or
allocated if necessary, based on the data starting offset. Each buffers
are aligned on bufsize for convenience. This ensures there is no overlap
between two contiguous buffers. Special care is taken when dealing with
a STREAM frame which must be splitted and stored in two contiguous
buffers.

When decoding input data, qcc_decode_qcs() is still invoked with a
single buffer as input. This requires a new while loop to ensure
decoding is performed accross multiple contiguous buffers until all data
are decoded or app stream buffer is full.

Also, after qcs_consume() has been performed, the stream Rx channel is
immediately closed if FIN was already received and QCS now contains only
a single buffer with all remaining data. This is necessary as qcc_recv()
is unable to close the Rx channel if FIN is received for a buffer
different from the current readable offset.

Note that for now stream flow-control value is still too low to fully
utilizing this new infrastructure and improve clients upload throughput.
Indeed, flow-control max-stream-data initial values are set to match
bufsize. This ensures that each QCS will use 1 buffer, or at most 2 if
data are splitted. A future patch will increase this value to unblock
this limitation.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
7b168e356f MINOR: mux-quic: adapt return value of qcc_decode_qcs()
Change return value of qcc_decode_qcs(). It now directly returns the
value from app_ops rcv_buf callback. Function documentation is updated
to reflect this.

For now, qcc_decode_qcs() return value is ignored by callers, so this
patch should not have any functional change. However, it will become
necessary when implementing multiple Rx buffers per QCS, as a loop will
be implemented to invoke qcc_decode_qcs() on several contiguous buffers.
Decoding must be stopped however as soon as an error is returned by
rcv_buf callback. This is also the case in case of a null value, which
indicates there is not enough data to continue decoding.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
6b5607d66f MINOR: mux-quic: adjust Rx data consumption API
HTTP/3 data are converted into HTX via qcc_decode_qcs() function. On
completion, these data are removed from QCS Rx buffer via qcs_consume().

This patch adjust qcs_consume() API with several changes. Firstly, the
Rx buffer instance to operate on must now be specified as a new argument
to the function. Secondly, buffer liberation when all data were removed
from qcs_consume() is extracted up to qcc_decode_qcs() caller.

No functional change with this patch. The objective is to have an API
which can be better adapted to multiple Rx buffers per QCS instance.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
a4f31ffeeb MINOR: mux-quic: store QCS Rx buf in a single-entry tree
Convert QCS rx buffer pointer to a tree container. Additionnaly, offset
field of qc_stream_rxbuf is thus transformed into a node tree.

For now, only a single Rx buffer is stored at most in QCS tree. Multiple
Rx buffers will be implemented in a future patch to improve QUIC clients
upload throughput.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
cc3c2d1f12 MINOR: mux-quic: define rxbuf wrapper
Define a new type qc_stream_rxbuf. This is used as a wrapper around QCS
Rx buffer with encapsulation of the ncbuf storage. It is allocated via a
new pool. Several functions are adapted to be able to deal with
qc_stream_rxbuf as a wrapper instead of the previous plain ncbuf
instance.

No functional change should happen with this patch. For now, only a
single qc_stream_rxbuf can be instantiated per QCS. However, this new
type will be useful to implement multiple Rx buffer storage in a future
commit.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
4b1e63d191 MINOR: mux-quic: define globally stream rxbuf size
QCS uses ncbuf for STREAM data storage. This serves as a limit for
maximum STREAM buffering capacity, advertised via QUIC transport
parameters for initial flow-control values.

Define a new function qmux_stream_rx_bufsz() which can be used to
retrieve this Rx buffer size. This can be used both in MUX/H3 layers and
in QUIC transport parameters.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
7dd1eec2b1 MINOR: mux-quic: refine reception of standalone STREAM FIN
Reception of standalone STREAM FIN is a corner case, which may be
difficult to handle. In particular, care must be taken to ensure app_ops
rcv_buf() is always called to be notify about FIN, even if Rx buffer is
empty or full demux flag is set. If this is the case, it could prevent
closure of QCS Rx channel.

To ensure this, rcv_buf() was systematically called if FIN was received,
with or without data payload. This could called unnecessary invokation
when FIN is transmitted with data and full demux flag is set, or data
are received out-of-order.

This patches improve qcc_recv() by detecting explicitely a standalone
FIN case. Thus, rcv_buf() is only forcefully called in this case and if
all data were already previously received.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
20dc8e4ec2 MINOR/OPTIM: mux-quic: do not allocate rxbuf on standalone FIN
STREAM FIN may be received without any payload. However, qcc_recv()
always called qcs_get_ncbuf() indiscriminately, which may allocate a QCS
Rx buffer. This is unneeded as there is no payload to store.

Improve this by skipping qcs_get_ncbuf() invokation when dealing with a
standalone FIN signal. This should prevent superfluous buffer
allocation.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
861b11334c MINOR: h3/hq-interop: restore function for standalone FIN receive
Previously, a function qcs_http_handle_standalone_fin() was implemented
to handle a received standalone FIN, bypassing app_ops layer decoding.
However, this was removed as app_ops layer interaction is necessary. For
example, HTTP/3 checks that FIN is never sent on the control uni stream.

This patch reintroduces qcs_http_handle_standalone_fin(), albeit in a
slightly diminished version. Most importantly, it is now the
responsibility of the app_ops layer itself to use it, to avoid the
shortcoming described above.

The main objective of this patch is to be able to support standalone FIN
in HTTP/0.9 layer. This is easily done via the reintroduction of
qcs_http_handle_standalone_fin() usage. This will be useful to perform
testing, as standalone FIN is a corner case which can easily be broken.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
6f95d0dad0 TESTS: quic: create first quic unittest
Define a first unit-test dedicated to QUIC. A single test for now
ensures that variable length decoding is compliant. This should be
extended in the future with new set of tests.
2025-03-07 12:06:26 +01:00
Willy Tarreau
5e558c1727 MINOR: stream/cli: make "show sess" support filtering on front/back/server
With "show sess", particularly "show sess all", we're often missing the
ability to inspect only streams attached to a frontend, backend or server.
Let's just add these filters to the command. Only one at a time may be set.

One typical use case could be to dump streams attached to a server after
issuing "shutdown sessions server XXX" to figure why any wouldn't stop
for example.
2025-03-07 10:38:12 +01:00
Willy Tarreau
2bd7cf53cb MINOR: stream/cli: rework "show sess" to better consider optional arguments
The "show sess" CLI command parser is getting really annoying because
several options were added in an exclusive mode as the single possible
argument. Recently some cumulable options were added ("show-uri") but
the older ones were not yet adapted. Let's just make sure that the
various filters such as "older" and "age" now belong to the options
and leave only <id>, "all", and "help" for the first ones. The doc was
updated and it's now easier to find these options.
2025-03-07 10:36:58 +01:00
Willy Tarreau
1cdf2869f6 BUG/MINOR: stream: fix age calculation in "show sess" output
The "show sess" output reports an age that's based on the last byte of
the HTTP request instead of the stream creation date, due to a confusion
between logs->request_ts and the request_date sample fetch function. Most
of the time these are equal except when the request is not yet full for
any reason (e.g. wait-body). This explains why a few "show sess" could
report a few new streams aged by 99 days for example.

Let's perform the correct request timestamp calculation like the sample
fetch function does, by adding t_idle and t_handshake to the accept_ts.
Now the stream's age is correct and can be correctly used with the
"show sess older <age>" variant.

This issue was introduced in 2.9 and the fix can be backported to 3.0.
2025-03-07 10:36:58 +01:00
Aurelien DARRAGON
dbb25720dd MINOR: cfgparse/peers: provide more info when ignoring invalid "peer" or "server" lines
Invalid (incomplete) "server" or "peer" lines under peers section are now
properly ignored. For completeness, in this patch we add some reports so
that the user knows that incomplete lines were ignored.

For an incomplete server line, since it is tolerated (see GH #565), we
only emit a diag warning.

For an incomplete peer line, we report a real warning, as it is not
expected to have a peer line without an address:port specified.

Also, 'newpeer == curpeers->local' check could be simplified since
we already have the 'local_peer' variable which tells us that the
parsed line refers to a local peer.
2025-03-07 09:39:51 +01:00
Aurelien DARRAGON
a76b5358f0 BUG/MINOR: server: dont return immediately from parse_server() when skipping checks
If parse_server() is called under peers section parser, and the address
needs to be parsed but it is missing, we directly return from the function

However since 0fc136ce5b ("REORG: server: use parsing ctx for server
parsing"), parse_server() uses parsing ctx to emit warning/errors, and
the ctx must be reset before returning from the function, yet this early
return was overlooked. Because of that, any ha_{warning,alert..} message
reported after early return from parse_server() could cause messages to
have an extra "parsing [file:line]" info.

We fix that by ensuring parse_server() doesn't return without resetting
the parsing context.

It should be backported up to 2.6
2025-03-07 09:39:46 +01:00
Aurelien DARRAGON
054443dfb9 BUG/MINOR: cfgparse/peers: properly handle ignored local peer case
In 8ba10fea6 ("BUG/MINOR: peers: Incomplete peers sections should be
validated."), some checks were relaxed in parse_server(), and extra logic
was added in the peers section parser in an attempt to properly ignore
incomplete "server" or "peer" statement under peers section.

This was done in response to GH #565, the main intent was that haproxy
should already complain about incomplete peers section (ie: missing
localpeer).

However, 8ba10fea69 explicitly skipped the peer cleanup upon missing
srv association for local peers. This is wrong because later haproxy
code always assumes that peer->srv is valid. Indeed, we got reports
that the (invalid) config below would cause segmentation fault on
all stable versions:

 global
   localpeer 01JM0TEPAREK01FQQ439DDZXD8

 peers my-table
   peer 01JM0TEPAREK01FQQ439DDZXD8

 listen dummy
   bind localhost:8080

To fix the issue, instead of by-passing some cleanup for the local
peer, handle this case specifically by doing the regular peer cleanup
and reset some fields set on the curpeers and curpeers proxy because
of the invalid local peer (do as if the peer was not declared).

It should still comply with requirements from #565.

This patch should be backported to all stable versions.
2025-03-06 22:05:29 +01:00
Aurelien DARRAGON
2560ab892f BUG/MINOR: cfgparse/peers: fix inconsistent check for missing peer server
In the "peers" section parser, right after parse_server() is called, we
used to check whether the curpeers->peers_fe->srv pointer was set or not
to know if parse_server() successfuly added a server to the peers proxy,
server that we can then associate to the new peer.

However the check is wrong, as curpeers->peers_fe->srv points to the
last added server, if a server was successfully added before the
failing one, we cannot detect that the last parse_server() didn't
add a server. This is known to cause bug with bad "peer"/"server"
statements.

To fix the issue, we save a pointer on the last known
curpeers->peers_fe->srv before parse_server() is called, and we then
compare the save with the pointer after parse_server(), if the value
didn't change, then parse_server() didn't add a server. This makes
the check consistent in all situations.

It should be backported to all stable versions.
2025-03-06 22:05:24 +01:00