"log-steps" was already ignored if directly defined in a backend section,
however, when defined in a defaults section it was inherited to all
proxies no matter their capability (ie: including backends).
As configurations often contain more backends than frontends, this would
result in wasted memory given that the log-steps setting is only
considered on frontends.
Let's fix that by preventing the inheritance from defaults section to
anything else than frontends. Also adjust the documentation to mention
that the setting in not relevant for backends.
Due to the introduction of smallbuf usage for HTTP/3 headers emission,
ret variable may be used uninitialized if buffer allocation fails due to
not enough room in QUIC connection window.
Fix this by setting ret value to 0.
Function variable declaration are also adjusted so that the pattern is
similar to h3_resp_headers_send(). Finally, outbuf buffer is also
removed as it is now unused.
No need to backport.
Similarly to HTTP/3 response encoding, a small buffer is first allocated
for the request encoding on the backend side. If this is not sufficient,
the smallbuf is replaced by a standard buffer and encoding is restarted.
This is useful to reduce the window usage over a connection of smaller
requests.
SSL libraries like wolfSSL that don't have the clienthello callback
mechanism enabled do not need to have the traces that are only called
from the said callback.
The code added to parse the ciphers relied on a function that wes not
defined in wolfSSL (SSL_CIPHER_find).
The contents of the extensions were only dumped with verbosity
'complete' which meant that the 'advanced' verbosity was pretty much
useless despite what its name implies (it was the same as the 'simple'
one).
The 'advanced' verbosity is now the "maximum" one, using 'complete'
would not add any extra information yet, but it leaves more room for
some actually large traces to be dumped later on (some complete
ClientHello dumps for instance).
The SSL libraries like OpenSSL for instance do not seem to actually
provide a public mapping between IANA defined curve IDs and curve names,
or even a mapping between curve IDs and internal NIDs.
This new table regroups all those information in a single table so that
we can convert curve names (be it SECG or NIST format) to curve IDs or
NIDs.
The previously existing 'curves2nid' function now uses the new table,
and a new 'curveid2str' one is added.
Since the following patch, app_ops layer is now responsible to report
that HTX block was the last transmitted so that FIN STREAM can be set.
This is mandatory to properly support HTTP 1xx interim responses.
f349df44b4
MINOR: qmux: change API for snd_buf FIN transmission
This change was correctly implemented in HTTP/3 code, however an issue
appeared on hq-interop transcoder in case zero-copy DATA transfer is
performed when HTX buffer is swapped. If this occured during the
transfer of the last HTX block, EOM is not detected and thus STREAM FIN
is never set.
Most of the times, QMUX shut callback is called immediately after. This
results in an emission of a RESET_STREAM to the client, which prevents
the data transfer.
To fix this, use the same method as HTTP/3 : HTX EOM flag status is
checked before any transfer, thus preserving it even after a zero-copy.
Criticity of this bug is low as hq-interop is experimental and is mostly
used for interop testing.
This should fix github issue #3038.
This patch must be backported wherever the above one is.
Willy noticed that it was not possible to select extra log origins using
log-steps directive. Extra origins are the one registered using
log_orig_register() such as http-req.
Reason was the error path was always executed during extra log origin
matching for log-steps parser, while it should only be executed if no
match was found.
It should be backported to 3.1.
Don't use the workaround to load libgcc_s on macOS. It is not needed
there, and it causes issues, as recent macOS dislike processes that fork
after threads where created (and the workaround creates a temporary
thread). This fixes crashes on macOS at least when using master-worker,
and using the system resolver.
This should fix Github issue #3035
This should be backported up to 2.8.
Not all platforms support thread-cpu bindings, so let's put
cpu_topo_dump_summary() under USE_CPU_AFFINITY guards.
Only needs to be backported if 1cc0e023ce ("MINOR: debug: add thread-cpu
bindings info in 'show dev' output") is backported.
This patch impacts the QUIC frontends. It reverts this patch
MINOR: quic-be: add a "CC connection" backend TX buffer pool
which adds <pool_head_quic_be_cc_buf> new pool to allocate CC (connection closed state)
TX buffers with bigger object size than the one for <pool_head_quic_cc_buf>.
Indeed the QUIC backends must be able to send at least 1200 bytes Initial packets.
For now on, both the QUIC frontends and backend use the same pool with
MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU)(1252 bytes) as object size.
Align titles style of debug_parse_cli_show_dev() with
cpu_dump_topology(). We will call the latter inside of
debug_parse_cli_show_dev() to show thread-cpu bindings info.
cpu_dump_topology() prints details about each enabled CPU and a summary with
clusters info and thread-cpu bindings. The latter is often usefull for
debugging and we want to add it in the 'show dev' output.
So, let's split cpu_dump_topology() in two parts: cpu_topo_debug() to print the
details about each enabled CPU; and cpu_topo_dump_summary() to print only the
summary.
In the next commit we will modify cpu_topo_dump_summary() to write into local
trash buffer and it could be easily called from debug_parse_cli_show_dev().
Exit with an error if multiple output filters (-ic, -srv, -st, -tc, -u*, etc.)
are used at the same time.
halog is designed to process and display output for only one filter at a time.
Using multiple filters simultaneously can cause a crash because the program is
not designed to manage multiple, separate result sets (e.g., one for
IP counts, another for URLs).
Supporting simultaneous filters would require a redesign to collect entries for
each filter in separate ebtree. This would negatively impact performance and is
not requested for the moment. This patch prevents the crash by checking filter
combinations just after the command line parsing.
This issue was reported in GitHUB #3031.
This should be backported in all stable versions.
The "connection close state" TX buffer is used to build the datagram with
basically a CONNECTION_CLOSE frame to notify the peer about the connection
closure. It allows the quic_conn memory release and its replacement by a lighter
quic_cc_conn struct.
For the QUIC backend, there is a dedicated pool to build such datagrams from
bigger TX buffers. But from quic_conn_release(), this is the pool dedicated
to the QUIC frontends which was used to release the QUIC backend TX buffers.
This patch simply adds a test about the target of the connection to release
the "connection close state" TX buffers from the correct pool.
No backport needed.
It's super difficult to find the rules that operate idle conns depending
on their idle/safe/avail/private status. Some are in lists, others not.
Some are in trees, others not. Some have a flag set, others not. This
documents the rules before the definitions in connection-t.h. It could
even be backported to help during backport sessions.
Replace all calls to qc_is_listener() (resp. !qc_is_listener()) by calls to
objt_listener() (resp. objt_server()).
Remove qc_is_listener() implement and QUIC_FL_CONN_LISTENER the flag it
relied on.
When an appctx is initialized, there is a BUG_ON() to be sure the appctx is
really initialized on the right thread to avoid bugs on the thread
affinity. However, it is possible to not choose the thread when the appctx
is created and let it starts on any thread. In that case, the thread
affinity is set when the appctx is initialized. So, we must take cate to not
trigger the BUG_ON() in that case.
For now, we never hit the bug because the thread affinity is always set
during the appctx creation.
This patch must be backport as far as 2.8.
The bug is a listener only one, and only occured on FreeBSD.
The FreeBSD issue has been reported here:
https://forums.freebsd.org/threads/quic-http-3-with-haproxy.98443/
where QUIC traces could reveal that sendmsg() calls lead to EINVAL
syscall errnos.
Such a similar issue could be reproduced from a FreeBSD 14-2 VM
with reg-tests/quic/retry.vtc as reg test.
As noted by Olivier, this issue could be fixed within the VM binding
the listener socket to INADDR_ANY.
That said, the symptoms are not exactly the same as the one reporte by the user.
What could be observed from such a VM is that if the first recvmsg() call
returns the datagram destination address, and if the listener
listening address is bound to a specific address, the calls to
sendmsg() fail because of the IP_SENDSRCADDR ip option value
set by cmsg_set_saddr(). According to the ip(4) freebsd manual
such an IP options must be used if the listening socket is
bound to a specific address. It is to be noted that into a VM
the first call to recvmsg() of the first connection does not return the datagram
destination address. This leads the first quic_conn to be initialized without
->local_addr value. This is this value which is used by IP_SENDSRCADDR
ip option. In this case, the sendmsg() calls (without IP_SENDSRCADDR)
never fail. The issue appears at the second condition.
This patch replaces the conditions to use IP_SENDSRCADDR to a call to
qc_may_use_saddr(). This latter also checks that the listener listening
address is not INADDR_ANY to allow the use of the source address.
It is generalized to all the OSes. Indeed, there is no reason to set the source
address when the listener is bound to a specific address.
Must be backported as far as 2.8.
On backend side, H3 layer is responsible to decode a HTTP/3 response
into an HTX message. Multiple responses may be received on a single
stream with interim status codes prior to the final one.
h3_resp_headers_to_htx() is the function used solely on backend side
responsible for H3 response to HTX transcoding. This patch extends it to
be able to properly support interim responses. When such a response is
received, the new flag H3_SF_RECV_INTERIM is set. This is converted to
QMUX qcs flag QC_SF_EOI_SUSPENDED.
The objective of this latter flag is to prevent stream EOI to be
reported during stream rcv_buf callback, even if HTX message contains
EOM and is empty. QC_SF_EOI_SUSPENDED will be cleared when the final
response is finally converted, which unblock stream EOI notification for
next rcv_buf invocations. Note however that HTX EOM is untouched : it is
always set for both interim and final response reception.
As a minor adjustment, HTX_SL_F_BODYLESS is always set for interim
responses.
Contrary to frontend interim response handling, a flag is necessary on
QMUX layer. This is because H3 to HTX transcoding and rcv_buf callback
are two distinct operations, called under different context (MUX vs
stream tasklet).
Also note that H3 layer has two distinct flags for interim response
handling, one only used as a server (FE side) and the other as a client
(BE side). It was preferred to used two distinct flags which is
considered less error-prone, contrary to a single unified flag which
would require to always set the proxy side to ensure it is relevant or
not.
No need to backport.
On frontend side, HTTP/3 layer is responsible to transcode an HTX
response message into HTTP/3 HEADERS frame. This operations is handled
via h3_resp_headers_send().
Prior to this patch, if HTX EOM was encountered in the HTX message after
response transcoding, <fin> was reported to the QMUX layer. This will in
turn cause FIN stream bit to be set when the response is emitted.
However, this is not correct as a single HTX response can be constitued
of several interim message, each delimited by EOM block.
Most of the time, this bug will cause the client to close the connection
as it is invalid to receive an interim response with FIN bit set.
Fixes this by now properly differentiate interim and final response.
During interim response transcoding, the new flag H3_SF_SENT_INTERIM
will be set, which will prevent <fin> to be reported. Thus, <fin> will
only be notified for the final response.
This must be backported up to 2.6. Note that it relies on the previous
patch which also must be taken.
Previous patches have fixes interim response encoding via
h3_resp_headers_send(). However, it is still necessary to adjust h3
layer state-machine so that several successive HTTP responses are
accepted for a single stream.
Prior to this, QMUX was responsible to decree that the final HTX message
was encoded so that FIN stream can be emitted. However, with interim
response, MUX is in fact unable to properly determine this. As such,
this is the responsibility of the application protocol layer. To reflect
this, app_ops snd_buf callback is modified so that a new output argument
<fin> is added to it.
Note that for now this commit does not bring any functional change.
However, it will be necessary for the following patch. As such, it
should be backported prior to it to every versions as necessary.
On frontend side, H3 layer transcodes HTX status code into HTTP/3
HEADERS frame. This is done by calling qpack_encode_int_status().
Prior to this patch, the latter function was also responsible to reject
an invalid value, which guarantee that only valid codes are encoded
(between 100 and 999 values). However, this is not practical as it is
impossible to differentiate between an invalid code error and a buffer
room exhaustation.
Changes this so that now HTTP/3 layer first ensures that HTX code is
valid. The stream is closed with H3_INTERNAL_ERROR if invalid value is
present. Thus, qpack_encode_int_status() will only report an error due
to buffer room exhaustion. If a small buffer is used, a standard buffer
will be reallocated which should be sufficient to encode the response.
The impact of this bug is minimal. Its main benefit is code clarity,
while also removing an unnecessary realloc when confronting with an
invalid HTTP code.
This should be backported at least up to 3.1. Prior to it, smallbuf
mechanism isn't present, hence the impact of this patch is less
important. However, it may still be backported to older versions, which
should facilitate picking patches for HTTP 1xx interim response support.
Previous commit fixes encoding of several following HTTP response
message when interim status codes are first reported. However,
h3_resp_headers_send() still was unable to interrupt encoding if output
buffer room was not sufficient. This case may be likely because small
buffers are used for headers encoding.
This commit fixes this situation. If output buffer is not empty prior to
response encoding, this means that a previous interim response message
was already encoded before. In this case, and if remaining space is not
sufficient, use buffer release mechanism : this allows to restart
response encoding by using a newer buffer. This process has already been
used for DATA and trailers encoding.
This must be backported up to 2.6. However, note that buffer release
mechanism is not present for version 2.8 and lower. In this case, qcs
flag QC_SF_BLK_MROOM should be enough as a replacement.
An HTTP response may contain several interim response message prior (1xx
status) to a final response message (all other status codes). This may
cause issues with h3_resp_headers_send() called for response encoding
which assumes that it is only call one time per stream, most notably
during output buffer handling.
This commit fixes output buffer handling when h3_resp_headers_send() is
called multiple times due to an interim response. Prior to it, interim
response was overwritten with newer response message. Most of the time,
this resulted in error for the client due to QPACK decoding failure.
This is now fixed so that each response is encoded one after the other.
Note that if encoding of several responses is bigger than output buffer,
an error is reported. This can definitely occurs as small buffer are
used during header encoding. This situation will be improved by the next
patch.
This must be backported up to 2.6.
The goal is to help figure the OS version (kernel and userland), any
virtualization/containers, and the haproxy version and build features.
Sometimes even reporters themselve can be mistaken about the running
version or environment. Also printing this at the top hepls draw a
visual delimitation between warnings and panic. Now we get something
like this:
PANIC! Thread 1 is about to kill the process.
HAProxy info:
version: 3.3-dev3-c863c0-18
features: +51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY (...)
Operating system info:
virtual machine: no
container: no
kernel: Linux 6.1.131 #1 SMP PREEMPT_DYNAMIC Fri Mar 14 01:04:55 CET 2025 x86_64
userland: Slackware 15.0 x86_64
* Thread 1 : id=0x7f615a8775c0 act=1 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
1/1 stuck=0 prof=0 harmless=0 isolated=0
cpu_ns: poll=1835010197 now=1835066102 diff=55905
(...)
For listen/frontend/backend, we now want to be able to clean up the
default-server directive that's no longer used past the end of the
section. For this we register a post-section function and perform the
cleanup there.
The default-server entry used to always be allocated. Now we'll postpone
its allocation for the first time we need it, i.e. during a "default-server"
directive, or when inheriting a defaults section which has one. The memory
savings are significant, on a large configuration with 100k backends and
no default-server directive, the memory usage dropped from 800MB RSS to
420MB (380 MB saved). It should be possible to also address configs using
default-server by releasing this entry when leaving the proxy section,
which is not done yet.
Now we only copy the default server's settings if such a default server
exists, otherwise we only initialize it. At the moment it always exists.
The change is mostly performed in srv_settings_cpy() since that's where
each caller passes through, and there's no point duplicating that test
everywhere.
The server struct has gone huge over time (~3.8kB), and having a copy
of it in the defsrv section of the struct proxy costs a lot of RAM,
that is not needed anymore at run time.
This patch replaces this struct with a dynamically allocated one. The
field is allocated and initialized during alloc_new_proxy() and is
freed when the proxy is destroyed for now. But the goal will be to
support freeing it after parsing the section.
The test in srv_ssl_settings_cpy() comparing src to the server's proxy's
default server does work but it's a subtle trap. Indeed, no check is made
on srv->proxy to be valid, and this only works because the compiler is
comparing pointer offsets. During the boot, it's common to have NULL here
in srv->proxy and of course in this case srv does not match that value
which is NULL plus epsilon. But when trying to turn defsrv to a dynamic
pointer instead, then the compiler is forced to dereference this NULL
srv->proxy and dies during init.
Let's always add the null check for srv->proxy before the test to avoid
this situation.
No backport is needed since the problem cannot happen yet.
At a few places we're seeing some open-coding of the same function, likely
because it looks overkill for what it's supposed to do, due to extraneous
tests that are not needed (e.g. check of the backend's PR_CAP_BE etc).
Let's just remove all these superfluous tests and inline it so that it
feels more suitable for use everywhere it's needed.
The server lookup in sticking_rule_find_target() uses an open-coded tree
search while we have a function for this server_find_by_id(). In addition,
due to the way it's coded, the stick-table lock also covers the server
lookup by accident instead of being released earlier. This is not a real
problem though since such feature is rarely used nowadays.
Let's clean all this stuff by first retrieving the ID under the lock and
then looking up the corresponding server.
The code used to detect proxy id conflicts uses an open-coded lookup
in the ID tree which is not necessary since we already have functions
for this. Let's switch to that instead.
This function doesn't just look at the name but also the ID when the
argument starts with a '#'. So the name is not correct and explains
why this function is not always used when the name only is needed,
and why the list-based findserver() is used instead. So let's just
call the function "server_find()", and rename its generation-id based
cousin "server_find_unique()".
Let's just use the tree-based lookup instead of walking through the list.
This function is used to find duplicates in "track" statements and a few
such places, so it's important not to waste too much time on large setups.
findserver() used to check for duplicate server names. These are no
longer accepted in 3.3 so let's get rid of that test and simplify the
code. Note that the function still only uses the list instead of the
tree.
Released version 3.3-dev3 with the following main changes :
- BUG/MINOR: quic-be: Wrong retry_source_connection_id check
- MEDIUM: sink: change the sink mode type to PR_MODE_SYSLOG
- MEDIUM: server: move _srv_check_proxy_mode() checks from server init to finalize
- MINOR: server: move send-proxy* incompatibility check in _srv_check_proxy_mode()
- MINOR: mailers: warn if mailers are configured but not actually used
- BUG/MEDIUM: counters/server: fix server and proxy last_change mixup
- MEDIUM: server: add and use a separate last_change variable for internal use
- MEDIUM: proxy: add and use a separate last_change variable for internal use
- MINOR: counters: rename last_change counter to last_state_change
- MINOR: ssl: check TLS1.3 ciphersuites again in clienthello with recent AWS-LC
- BUG/MEDIUM: hlua: Forbid any L6/L7 sample fetche functions from lua services
- BUG/MEDIUM: mux-h2: Properly handle connection error during preface sending
- BUG/MINOR: jwt: Copy input and parameters in dedicated buffers in jwt_verify converter
- DOC: Fix 'jwt_verify' converter doc
- MINOR: jwt: Rename pkey to pubkey in jwt_cert_tree_entry struct
- MINOR: jwt: Remove unused parameter in convert_ecdsa_sig
- MAJOR: jwt: Allow certificate instead of public key in jwt_verify converter
- MINOR: ssl: Allow 'commit ssl cert' with no privkey
- MINOR: ssl: Prevent delete on certificate used by jwt_verify
- REGTESTS: jwt: Add test with actual certificate passed to jwt_verify
- REGTESTS: jwt: Test update of certificate used in jwt_verify
- DOC: 'jwt_verify' converter now supports certificates
- REGTESTS: restrict execution to a single thread group
- MINOR: ssl: Introduce new smp_client_hello_parse() function
- MEDIUM: stats: add persistent state to typed output format
- BUG/MINOR: httpclient: wrongly named httpproxy flag
- MINOR: ssl/ocsp: stop using the flags from the httpclient CLI
- MEDIUM: httpclient: split the CLI from the actual httpclient API
- MEDIUM: httpclient: implement a way to use directly htx data
- MINOR: httpclient/cli: add --htx option
- BUILD: dev/phash: remove the accidentally committed a.out file
- BUG/MINOR: ssl: crash in ssl_sock_io_cb() with SSL traces and idle connections
- BUILD/MEDIUM: deviceatlas: fix when installed in custom locations.
- DOC: deviceatlas build clarifications
- BUG/MINOR: ssl/ocsp: fix definition discrepancies with ocsp_update_init()
- MINOR: proto-tcp: Add support for TCP MD5 signature for listeners and servers
- BUILD: cfgparse-tcp: Add _GNU_SOURCE for TCP_MD5SIG_MAXKEYLEN
- BUG/MINOR: proto-tcp: Take care to initialized tcp_md5sig structure
- BUG/MINOR: http-act: Fix parsing of the expression argument for pause action
- MEDIUM: httpclient: add a Content-Length when the payload is known
- CLEANUP: ssl: Rename ssl_trace-t.h to ssl_trace.h
- MINOR: pattern: add a counter of added/freed patterns
- CI: set DEBUG_STRICT=2 for coverity scan
- CI: enable USE_QUIC=1 for OpenSSL versions >= 3.5.0
- CI: github: add an OpenSSL 3.5.0 job
- CI: github: update the stable CI to ubuntu-24.04
- BUG/MEDIUM: quic: SSL/TCP handshake failures with OpenSSL 3.5
- CI: github: update to OpenSSL 3.5.1
- BUG/MINOR: quic: Missing TLS 1.3 QUIC cipher suites and groups inits (OpenSSL 3.5 QUIC API)
- BUG/MINOR: quic-be: Malformed coalesced Initial packets
- MINOR: quic: Prevent QUIC backend use with the OpenSSL QUIC compatibility module (USE_OPENSS_COMPAT)
- MINOR: reg-tests: first QUIC+H3 reg tests (QUIC address validation)
- MINOR: quic-be: Set the backend alpn if not set by conf
- MINOR: quic-be: TLS version restriction to 1.3
- MINOR: cfgparse: enforce QUIC MUX compat on server line
- MINOR: server: support QUIC for dynamic servers
- CI: github: skip a ssl library version when latest is already in the list
- MEDIUM: resolvers: switch dns-accept-family to "auto" by default
- BUG/MINOR: resolvers: don't lower the case of binary DNS format
- MINOR: resolvers: do not duplicate the hostname_dn field
- MINOR: proto-tcp: Register a feature to report TCP MD5 signature support
- BUG/MINOR: listener: really assign distinct IDs to shards
- MINOR: quic: Prevent QUIC build with OpenSSL 3.5 new QUIC API version < 3.5.1
- BUG/MEDIUM: quic: Crash after QUIC server callbacks restoration (OpenSSL 3.5)
- REGTESTS: use two haproxy instances to distinguish the QUIC traces
- BUG/MEDIUM: http-client: Don't wake http-client applet if nothing was xferred
- BUG/MEDIUM: http-client: Properly inc input data when HTX blocks are xferred
- BUG/MEDIUM: http-client: Ask for more room when request data cannot be xferred
- BUG/MEDIUM: http-client: Test HTX_FL_EOM flag before commiting the HTX buffer
- BUG/MINOR: http-client: Ignore 1XX interim responses in non-HTX mode
- BUG/MINOR: http-client: Reject any 101-switching-protocols response
- BUG/MEDIUM: http-client: Drain the request if an early response is received
- BUG/MEDIUM: http-client: Notify applet has more data to deliver until the EOM
- BUG/MINOR: h3: fix https scheme request encoding for BE side
- MINOR: h1-htx: Add function to format an HTX message in its H1 representation
- BUG/MINOR: mux-h1: Use configured error files if possible for early H1 errors
- BUG/MINOR: h1-htx: Don't forget to init flags in h1_format_htx_msg function
- CLEANUP: assorted typo fixes in the code, commits and doc
- BUILD: adjust scripts/build-ssl.sh to modern CMake system of QuicTLS
- MINOR: debug: add distro name and version in postmortem