Since now_offset is a static variable and is not exposed outside from
clock.c, let's add an helper so that it becomes possible to set its
value from another source file.
post_section_px_cleanup(), which was implemented in abcc73830
("MEDIUM: proxy: register a post-section cleanup function"), is called
for the current section no matter if the parsing was aborted due to
a fatal error. In this case, the curproxy pointer may point to NULL,
yet post_section_px_cleanup() assumes curproxy pointer is always valid,
which could lead to NULL-deref.
For instance, the config below will cause SEGFAULT:
listen toto titi
To fix the issue, let's simply consider that the curproxy pointer may
be NULL in post_section_px_cleanup(), in which case we skip the cleanup
for the curproxy since there is nothing we can do.
No backport needed
When improper arguments are provided on proxy directive (listen,
frontend or backend), such alert may be emitted:
"please use the 'bind' keyword for listening addresses"
This was introduced in 6e62fb6405 ("MEDIUM: cfgparse: check section
maximum number of arguments"). However, despite the error being reported
as alert, the err_code isn't updated accordingly, which could make the
upper parser think there was no error, while it isn't the case.
In practise since the proxy directive is ignored following proxy related
directives should raise errors, so this didn't cause much harm, yet
better fix that.
It could be backported to all stable versions.
Since 368d01361 (" MEDIUM: server: add and use srv_init() function"), in
case of srv_init() error, we simply increment cfgerr variable and keep
going.
It isn't enough, some treatment occuring later in check_config_validity()
assume that srv_init() succeeded for servers, and may cause undefined
behavior. To fix the issue, let's consider that if (srv_init() & ERR_CODE)
returns true, then we must stop checking the config immediately.
No backport needed unless 368d01361 is.
Previously quic_conn <target> member was used to determine if quic_conn
was used on the frontend (as server) or backend side (as client). A new
helper function can now be used to directly check flag
QUIC_FL_CONN_IS_BACK.
This reduces the dependency between quic_conn and their relative
listener/server instances.
Define a new quic_conn flag assign if the connection is used on the
backend side. This is similar to other haproxy components such as struct
connection and muxes element.
This flag is positionned via qc_new_conn(). Also update quic traces to
mark proxy side as 'F' or 'B' suffix.
QUIC emission can use GSO to emit multiple datagrams with a single
syscall invokation. However, this feature relies on several kernel
parameters which are checked on haproxy process startup.
Even if these checks report no issue, GSO may still be unable due to the
underlying network adapter underneath. Thus, if a EIO occured on
sendmsg() with GSO, listener is flagged to mark GSO as unsupported. This
allows every other QUIC connections to share the status and avoid using
GSO when using this listener.
Previously, listener flag was checked for every QUIC emission. This was
done using an atomic operation to prevent races. Improve this by
duplicating GSO unsupported status as the connection level. This is done
on qc_new_conn() and also on thread rebinding if a new listener instance
is used.
The main benefit from this patch is to reduce the dependency between
quic_conn and listener instances.
Released version 3.3-dev6 with the following main changes :
- MINOR: acme: implement traces
- BUG/MINOR: hlua: take default-path into account with lua-load-per-thread
- CLEANUP: counters: rename counters_be_shared_init to counters_be_shared_prepare
- MINOR: clock: make global_now_ms a pointer
- MINOR: clock: make global_now_ns a pointer as well
- MINOR: mux-quic: release conn after shutdown on BE reuse failure
- MINOR: session: strengthen connection attach to session
- MINOR: session: remove redundant target argument from session_add_conn()
- MINOR: session: strengthen idle conn limit check
- MINOR: session: do not release conn in session_check_idle_conn()
- MINOR: session: streamline session_check_idle_conn() usage
- MINOR: muxes: refactor private connection detach
- BUG/MEDIUM: mux-quic: ensure Early-data header is set
- BUILD: acme: avoid declaring TRACE_SOURCE in acme-t.h
- MINOR: acme: emit a log for DNS-01 challenge response
- MINOR: acme: emit the DNS-01 challenge details on the dpapi sink
- MEDIUM: acme: allow to wait and restart the task for DNS-01
- MINOR: acme: update the log for DNS-01
- BUG/MINOR: acme: possible integer underflow in acme_txt_record()
- BUG/MEDIUM: hlua_fcn: ensure systematic watcher cleanup for server list iterator
- MINOR: sample: Add le2dec (little endian to decimal) sample fetch
- BUILD: fcgi: fix the struct name of fcgi_flt_ctx
- BUILD: compat: provide relaxed versions of the MIN/MAX macros
- BUILD: quic: use _MAX() to avoid build issues in pools declarations
- BUILD: compat: always set _POSIX_VERSION to ease comparisons
- MINOR: implement ha_aligned_alloc() to return aligned memory areas
- MINOR: pools: support creating a pool from a pool registration
- MINOR: pools: add a new flag to declare static registrations
- MINOR: pools: force the name at creation time to be a const.
- MEDIUM: pools: change the static pool creation to pass a registration
- DEBUG: pools: store the pool registration file name and line number
- DEBUG: pools: also retrieve file and line for direct callers of create_pool()
- MEDIUM: pools: add an alignment property
- MINOR: pools: add macros to register aligned pools
- MINOR: pools: add macros to declare pools based on a struct type
- MEDIUM: pools: respect pool alignment in allocations
Now pool_alloc_area() takes the alignment in argument and makes use
of ha_aligned_malloc() instead of malloc(). pool_alloc_area_uaf()
simply applies the alignment before returning the mapped area. The
pool_free() functionn calls ha_aligned_free() so as to permit to use
a specific API for aligned alloc/free like mingw requires.
Note that it's possible to see warnings about mismatching sized
during pool_free() since we know both the pool and the type. In
pool_free, adding just this is sufficient to detect potential
offenders:
WARN_ON(__alignof__(*__ptr) > pool->align);
DECLARE_TYPED_POOL() and friends take a name, a type and an extra
size (to be added to the size of the element), and will use this
to create the pool. This has the benefit of letting the compiler
automatically adapt sizeof() and alignof() based on the type
declaration.
This adds an alignment argument to create_pool_from_loc() and
completes the existing low-level macros with new ones that expose
the alignment and the new macros permit to specify it. For now
they're not used.
This will be used to declare aligned pools. For now it's not used,
but it's properly set from the various registrations that compose
a pool, and rounded up to the next power of 2, with a minimum of
sizeof(void*).
The alignment is returned in the "show pools" part that indicates
the entry size. E.g. "(56 bytes/8)" means 56 bytes, aligned by 8.
Just like previous patch, we want to retrieve the location of the caller.
For this we turn create_pool() into a macro that collects __FILE__ and
__LINE__ and passes them to the now renamed function create_pool_with_loc().
Now the remaining ~30 pools also have their location stored.
When pools are declared using DECLARE_POOL(), REGISTER_POOL etc, we
know where they are and it's trivial to retrieve the file name and line
number, so let's store them in the pool_registration, and display them
when known in "show pools detailed".
Now we're creating statically allocated registrations instead of
passing all the parameters and allocating them on the fly. Not only
this is simpler to extend (we're limited in number of INITCALL args),
but it also leaves all of these in the data segment where they are
easier to find when debugging.
This is already the case as all names are constant so that's fine. If
it would ever change, it's not very hard to just replace it in-situ
via an strdup() and set a flag to mention that it's dynamically
allocated. We just don't need this right now.
One immediately visible effect is in "show pools detailed" where the
names are no longer truncated.
We've recently introduced pool registrations to be able to enumerate
all pool creation requests with their respective parameters, but till
now they were only used for debugging ("show pools detailed"). Let's
go a step further and split create_pool() in two:
- the first half only allocates and sets the pool registration
- the second half creates the pool from the registration
This is what this patch does. This now opens the ability to pre-create
registrations and create pools directly from there.
We have two versions, _safe() which verifies and adjusts alignment,
and the regular one which trusts the caller. There's also a dedicated
ha_aligned_free() due to mingw.
The currently detected OSes are mingw, unixes older than POSIX 200112
which require memalign(), and those post 200112 which will use
posix_memalign(). Solaris 10 reports 200112 (probably through
_GNU_SOURCE since it does not do it by default), and Solaris 11 still
supports memalign() so for all Solaris we use memalign(). The memstats
wrappers are also implemented, and have the exported names. This was
the opportunity for providing a separate free call that lets the caller
specify the size (e.g. for use with pools).
For now this code is not used.
Sometimes we need to compare it to known versions, let's make sure it's
always defined. We set it to zero if undefined so that it cannot match
any comparison.
With the upcoming pool declaration, we're filling a struct's fields,
while older versions were relying on initcalls which could be turned
to function declarations. Thus the compound expressions that were
usable there are not necessarily anymore, as witnessed here with
gcc-5.5 on solaris 10:
In file included from include/haproxy/quic_tx.h:26:0,
from src/quic_tx.c:15:
include/haproxy/compat.h:106:19: error: braced-group within expression allowed only inside a function
#define MAX(a, b) ({ \
^
include/haproxy/pool.h:41:11: note: in definition of macro '__REGISTER_POOL'
.size = _size, \
^
...
include/haproxy/quic_tx-t.h:6:29: note: in expansion of macro 'MAX'
#define QUIC_MAX_CC_BUFSIZE MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU)
Let's make the macro use _MAX() instead of MAX() since it relies on pure
constants.
In 3.0 the MIN/MAX macros were converted to compound expressions with
commit 0999e3d959 ("CLEANUP: compat: make the MIN/MAX macros more
reliable"). However with older compilers these are not supported out
of code blocks (e.g. to initialize variables or struct members). This
is the case on Solaris 10 with gcc-5.5 when QUIC doesn't compile
anymore with the future pool registration:
In file included from include/haproxy/quic_tx.h:26:0,
from src/quic_tx.c:15:
include/haproxy/compat.h:106:19: error: braced-group within expression allowed only inside a function
#define MAX(a, b) ({ \
^
include/haproxy/pool.h:41:11: note: in definition of macro '__REGISTER_POOL'
.size = _size, \
^
...
include/haproxy/quic_tx-t.h:6:29: note: in expansion of macro 'MAX'
#define QUIC_MAX_CC_BUFSIZE MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU)
Let's provide the old relaxed versions as _MIN/_MAX for use with constants
like such cases where it's certain that there is no risk. A previous attempt
using __builtin_constant_p() to switch between the variants did not work,
and it's really not worth the hassle of going this far.
The struct was mistakenly spelled flt_fcgi_ctx() in fcgi_flt_stop()
when it was introduced in 2.1 with commit 78fbb9f991 ("MEDIUM:
fcgi-app: Add FCGI application and filter"), causing build issues
when trying to get the alignment of the object in pool_free() for
debugging purposes. No backport is needed as it's just used to convey
a pointer.
This commit introduces a sample fetch, `le2dec`, to convert
little-endian binary input samples into their decimal representations.
The function converts the input into a string containing unsigned
integer numbers, with each number derived from a specified number of
input bytes. The numbers are separated using a user-defined separator.
This new sample is achieved by adding a parametrized sample_conv_2dec
function, unifying the logic for be2dec and le2dec converters.
Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>
[wt: tracked as GH issue #2915]
Signed-off-by: Willy Tarreau <w@1wt.eu>
In 358166a ("BUG/MINOR: hlua_fcn: restore server pairs iterator pointer
consistency"), I wrongly assumed that because the iterator was a temporary
object, no specific cleanup was needed for the watcher.
In fact watcher_detach() is not only relevant for the watcher itself, but
especially for its parent list to remove the current watcher from it.
As iterators are temporary objects, failing to remove their watchers from
the server watcher list causes the server watcher list to be corrupted.
On a normal iteration sequence, the last watcher_next() receives NULL
as target so it successfully detaches the last watcher from the list.
However the corner case here is with interrupted iterators: users are
free to break away from the iteration loop when a specific condition is
met for instance from the lua script, when this happens
hlua_listable_servers_pairs_iterator() doesn't get a chance to detach the
last iterator.
Also, Lua doesn't tell us that the loop was interrupted,
so to fix the issue we rely on the garbage collector to force a last
detach right before the object is freed. To achieve that, watcher_detach()
was slightly modified so that it becomes possible to call it without
knowing if the watcher is already detached or not, if watcher_detach() is
called on a detached watcher, the function does nothing. This way it saves
the caller from having to track the watcher state and makes the API a
little more convenient to use. This way we now systematically call
watcher_detach() for server iterators right before they are garbage
collected.
This was first reported in GH #3055. It can be observed when the server
list is browsed one than more time when it was already browsed from Lua
for a given proxy and the iteration was interrupted before the end. As the
watcher list is corrupted, the common symptom is watcher_attach() or
watcher_next() not ending due to the internal mt_list call looping
forever.
Thanks to GH users @sabretus and @sabretus for their precious help.
It should be backported everywhere 358166a was.
a2base64url() can return a negative value is olen is too short to
accept ilen. This is not supposed to happen since the sha256 should
always fit in a buffer. But this is confusing since a2base64()
returns a signed integer which is pt in output->data which is unsigned.
Fix the issue by setting ret to 0 instead of -1 upon error. And returns
a unsigned integer instead of a signed one.
This patch also checks the return value from the caller in order
to emit an error instead of setting trash.data which is already done
from the function.
DNS-01 needs a external process which would register a TXT record on a
DNS provider, using a REST API or something else.
To achieve this, the process should read the dpapi sink and wait for
events. With the DNS-01 challenge, HAProxy will put the task to sleep
before asking the ACME server to achieve the challenge. The task then
need to be woke up, using the command implemented by this patch.
This patch implements the "acme challenge_ready" command which should be
used by the agent once the challenge was configured in order to wake the
task up.
Example:
echo "@1 acme challenge_ready foobar.pem.rsa domain kikyo" | socat /tmp/master.sock -
This commit adds a new message to the dpapi sink which is emitted during
the new authorization request.
One message is emitted by challenge to resolve. The certificate name as
well as the thumprint of the account key are on the first line of the
message. A dump of the JSON response for 1 challenge is dumped, en the
message ends with a \0.
The agent consuming these messages MUST NOT access the URLs, and SHOULD
only uses the thumbprint, dns and token to configure a challenge.
Example:
$ ( echo "@@1 show events dpapi -w -0"; cat - ) | socat /tmp/master.sock - | cat -e
<0>2025-08-01T16:23:14.797733+02:00 acme deploy foobar.pem.rsa thumbprint Gv7pmGKiv_cjo3aZDWkUPz5ZMxctmd-U30P2GeqpnCo$
{$
"status": "pending",$
"identifier": {$
"type": "dns",$
"value": "foobar.com"$
},$
"challenges": [$
{$
"type": "dns-01",$
"url": "https://0.0.0.0:14000/chalZ/1o7sxLnwcVCcmeriH1fbHJhRgn4UBIZ8YCbcrzfREZc",$
"token": "tvAcRXpNjbgX964ScRVpVL2NXPid1_V8cFwDbRWH_4Q",$
"status": "pending"$
},$
{$
"type": "dns-account-01",$
"url": "https://0.0.0.0:14000/chalZ/z2_WzibwTPvE2zzIiP3BF0zNy3fgpU_8Nj-V085equ0",$
"token": "UedIMFsI-6Y9Nq3oXgHcG72vtBFWBTqZx-1snG_0iLs",$
"status": "pending"$
},$
{$
"type": "tls-alpn-01",$
"url": "https://0.0.0.0:14000/chalZ/AHnQcRvZlFw6e7F6rrc7GofUMq7S8aIoeDileByYfEI",$
"token": "QhT4ejBEu6ZLl6pI1HsOQ3jD9piu__N0Hr8PaWaIPyo",$
"status": "pending"$
},$
{$
"type": "http-01",$
"url": "https://0.0.0.0:14000/chalZ/Q_qTTPDW43-hsPW3C60NHpGDm_-5ZtZaRfOYDsK3kY8",$
"token": "g5Y1WID1v-hZeuqhIa6pvdDyae7Q7mVdxG9CfRV2-t4",$
"status": "pending"$
}$
],$
"expires": "2025-08-01T15:23:14Z"$
}$
^@
This commit emits a log which output the TXT entry to create in case of
DNS-01. This is useful in cases you want to update your TXT entry
manually.
Example:
acme: foobar.pem.rsa: DNS-01 requires to set the "acme-challenge.example.com" TXT record to "7L050ytWm6ityJqolX-PzBPR0LndHV8bkZx3Zsb-FMg"
Files ending with '-t.h' are supposed to be used for structure
definitions and could be included in the same file to check API
definitions.
This patch removes TRACE_SOURCE from acme-t.h to avoid conflicts with
other TRACE_SOURCE definitions.
QUIC MUX may be initialized prior to handshake completion, when 0-RTT is
used. In this case, connection is flagged with CO_FL_EARLY_SSL_HS, which
is notably used by wait-for-hs http rule.
Early data may be subject to replay attacks. For this reason, haproxy
adds the header 'Early-data: 1' to all requests handled as TLS early
data. Thus the server can reject it if it is deemed unsafe. This header
injection is implemented by http-ana. However, it was not functional
with QUIC due to missing CO_FL_EARLY_DATA connection flag.
Fix this by ensuring that QUIC MUX sets CO_FL_EARLY_DATA when needed.
This is performed during qcc_recv() for STREAM frame reception. It is
only set if QC_CF_WAIT_HS is set, meaning that the handshake is not yet
completed. After this, the request is considered safe and Early-data
header is not necessary anymore.
This should fix github issue #3054.
This must be backported up to 3.2 at least. If possible, it should be
backported to all stable releases as well. On these versions, the
current patch relies on the following refactoring commit :
commit 0a53a008d0
MINOR: mux-quic: refactor wait-for-handshake support
Following the latest adjustment on session_add_conn() /
session_check_idle_conn(), detach muxes callbacks were rewritten for
private connection handling.
Nothing really fancy here : some more explicit comments and the removal
of a duplicate checks on idle conn status for muxes with true
multipexing support.
session_check_idle_conn() is called by muxes when a connection becomes
idle. It ensures that the session idle limit is not yet reached. Else,
the connection is removed from the session and it can be freed.
Prior to this patch, session_check_idle_conn() was compatible with a
NULL session argument. In this case, it would return true, considering
that no limit was reached and connection not removed.
However, this renders the function error-prone and subject to future
bugs. This patch streamlines it by ensuring it is never called with a
NULL argument. Thus it can now only returns true if connection is kept
in the session or false if it was removed, as first intended.
session_check_idle_conn() is called to flag a connection already
inserted in a session list as idle. If the session limit on the number
of idle connections (max-session-srv-conns) is exceeded, the connection
is removed from the session list.
In addition to the connection removal, session_check_idle_conn()
directly calls MUX destroy callback on the connection. This means the
connection is freed by the function itself and should not be used by the
caller anymore.
This is not practical when an alternative connection closure method
should be used, such as a graceful shutdown with QUIC. As such, remove
MUX destroy invokation : this is now the responsability of the caller to
either close or release immediately the connection.
Add a BUG_ON() on session_check_idle_conn() to ensure the connection is
not already flagged as CO_FL_SESS_IDLE.
This checks that this function is only called one time per connection
transition from active to idle. This is necessary to ensure that session
idle counter is only incremented one time per connection.
session_add_conn() uses three argument : connection and session
instances, plus a void pointer labelled as target. Typically, it
represents the server, but can also be a backend instance (for example
on dispatch).
In fact, this argument is redundant as <target> is already a member of
the connection. This commit simplifies session_add_conn() by removing
it. A BUG_ON() on target is extended to ensure it is never NULL.
This commit is the first one of a serie to refactor insertion of backend
private connection into the session list.
session_add_conn() is used to attach a connection into a session list.
Previously, this function would report an error if the connection
specified was already attached to another session. However, this case
currently never happens and thus can be considered as buggy.
Remove this check and replace it with a BUG_ON(). This allows to ensure
that session insertion remains consistent. The same check is also
transformed in session_check_idle_conn().
On stream detach on backend side, connection is inserted in the proper
server/session list to be able to reuse it later. If insertion fails and
the connection is idle, the connection can be removed immediately.
If this occurs on a QUIC connection, QUIC MUX implements graceful
shutdown to ensure the server is notified of the closure. However, the
connection instance is not freed. Change this to ensure that both
shutdown and release is performed.
This is preparation work for shared counters between co-processes. As
co-processes will need to share a common date. global_now_ms will be used
for that as it will point to the shm when sharing is enabled.
Thus in this patch we turn global_now_ms into a pointer (and adjust the
places where it is written to and read from, hopefully atomic operations
through pointer are already used so the change is trivial)
For now global_now_ms points to process-local _global_now_ms which is a
fallback for when sharing through the shm is not enabled.
75e480d10 ("MEDIUM: stats: avoid 1 indirection by storing the shared
stats directly in counters struct") took care of renaming
counters_fe_shared_init() but we forgot counters_be_shared_init().
Let's fix that for consistency
As discussed in GH #3051, default-path is not taken into account when
loading files using lua-load-per-thread. In fact, the initial
hlua_load_state() (performed on first thread which parses the config)
is successful, but other threads run hlua_load_state() later based
on config hints which were saved by the first thread, and those config
hints only contain the file path provided on the lua-load-per-thread
config line, not the absolute one. Indeed, `default-path` directive
changes the current working directory only for the thread parsing the
configuration.
To fix the issue, when storing config hints under hlua_load_per_thread()
we now make sure to save the absolute file path for `lua-load-per-thread'
argument.
Thanks to GH user @zhanhb for having reported the issue
It may be backported to all stable versions.
Implement traces for the ACME protocol.
-dt acme:data:complete will dump every input and output buffers,
including decoded buffers before being converted to JWS.
It will also dump certificates in the traces.
-dt acme:user:complete will only dump the state of the task handler.
Released version 3.3-dev5 with the following main changes :
- BUG/MEDIUM: queue/stats: also use stream_set_srv_target() for pendconns
- DOC: list missing global QUIC settings
Complete list of global keywords with missing QUIC entries.
This could be backported to stable versions. This requires to take into
account the version of introduction for each keyword.
* limited-quic, introduced in 2.8
* no-quic, introduced in 2.8
* tune.quic.cc.cubic.min-losses, introduced in 3.1
Following c24de07 ("OPTIM: stats: store fast sharded counters pointers
at session and stream level") some crashes were observed in
connect_server():
#0 0x00000000007ba39c in connect_server (s=0x65117b0) at src/backend.c:2101
2101 _HA_ATOMIC_INC(&s->sv_tgcounters->connect);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-325.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 nss-softokn-freebl-3.67.0-3.el7_9.x86_64 pcre-8.32-17.el7.x86_64
(gdb) bt
#0 0x00000000007ba39c in connect_server (s=0x65117b0) at src/backend.c:2101
#1 0x00000000007baff8 in back_try_conn_req (s=0x65117b0) at src/backend.c:2378
#2 0x00000000006c0e9f in process_stream (t=0x650f180, context=0x65117b0, state=8196) at src/stream.c:2366
#3 0x0000000000bd3e51 in run_tasks_from_lists (budgets=0x7ffd592752e0) at src/task.c:655
#4 0x0000000000bd49ef in process_runnable_tasks () at src/task.c:889
#5 0x0000000000851169 in run_poll_loop () at src/haproxy.c:2834
#6 0x0000000000851865 in run_thread_poll_loop (data=0x1a03580 <ha_thread_info>) at src/haproxy.c:3050
#7 0x0000000000852a53 in main (argc=7, argv=0x7ffd592755f8) at src/haproxy.c:3637
Here the crash occurs during the atomic inc of a sv_tgcounters metric from
the stream pointer, which tells us the pointer is likely garbage.
In fact, we assign s->sv_tgcounters each time the stream target is set to
a valid server. For that we use stream_set_srv_target() helper which does
assigment for us. By reviewing the code, in turns out we forgot to call
stream_set_srv_target() in pendconn_dequeue(), where the stream target
is set to the server who picked the pendconn.
Let's fix the bug by using stream_set_srv_target() there.
No backport needed unless c24de07 is.