This commit introduces a sample fetch, `le2dec`, to convert
little-endian binary input samples into their decimal representations.
The function converts the input into a string containing unsigned
integer numbers, with each number derived from a specified number of
input bytes. The numbers are separated using a user-defined separator.
This new sample is achieved by adding a parametrized sample_conv_2dec
function, unifying the logic for be2dec and le2dec converters.
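The conversion described above can be sketched as a small standalone helper. This is only an illustration under assumptions, not the actual sample_conv_2dec() code: the name `le_to_dec`, its signature, the 8-byte chunk limit and the choice to convert a shorter trailing chunk as-is are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not HAProxy's sample_conv_2dec(): converts <ilen>
 * bytes from <in> into decimal numbers written to <out>, each number being
 * built from up to <chunk> little-endian bytes, separated by <sep>.
 * Returns the output length, or -1 if <out> is too small or <chunk> is
 * not in 1..8.
 */
static int le_to_dec(const unsigned char *in, size_t ilen, size_t chunk,
                     const char *sep, char *out, size_t olen)
{
	size_t pos = 0, used = 0;

	if (chunk < 1 || chunk > sizeof(uint64_t) || olen == 0)
		return -1;
	out[0] = '\0';

	while (pos < ilen) {
		size_t n = (ilen - pos < chunk) ? ilen - pos : chunk;
		uint64_t val = 0;
		int ret;

		/* the least significant byte comes first in little-endian */
		for (size_t i = 0; i < n; i++)
			val |= (uint64_t)in[pos + i] << (8 * i);

		ret = snprintf(out + used, olen - used, "%s%llu",
		               pos ? sep : "", (unsigned long long)val);
		if (ret < 0 || (size_t)ret >= olen - used)
			return -1;
		used += (size_t)ret;
		pos += n;
	}
	return (int)used;
}
```

With the input bytes 01 02 03 04, a chunk size of 2 and "-" as separator, this sketch produces "513-1027" (0x0201 and 0x0403).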
Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>
[wt: tracked as GH issue #2915]
Signed-off-by: Willy Tarreau <w@1wt.eu>
In 358166a ("BUG/MINOR: hlua_fcn: restore server pairs iterator pointer
consistency"), I wrongly assumed that because the iterator was a temporary
object, no specific cleanup was needed for the watcher.
In fact watcher_detach() is not only relevant for the watcher itself, but
especially for its parent list to remove the current watcher from it.
As iterators are temporary objects, failing to remove their watchers from
the server watcher list causes the server watcher list to be corrupted.
On a normal iteration sequence, the last watcher_next() receives NULL
as target so it successfully detaches the last watcher from the list.
However, the corner case here is with interrupted iterators: users are
free to break away from the iteration loop when a specific condition is
met, for instance from the Lua script. When this happens,
hlua_listable_servers_pairs_iterator() doesn't get a chance to detach the
last iterator.
Also, Lua doesn't tell us that the loop was interrupted,
so to fix the issue we rely on the garbage collector to force a last
detach right before the object is freed. To achieve that, watcher_detach()
was slightly modified so that it becomes possible to call it without
knowing if the watcher is already detached or not: if watcher_detach() is
called on a detached watcher, the function does nothing. This saves
the caller from having to track the watcher state and makes the API a
little more convenient to use. We now systematically call
watcher_detach() for server iterators right before they are garbage
collected.
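The idea of a detach that is safe to call in any state can be sketched as follows. This is a simplified, single-threaded stand-in: the real watcher code relies on mt_list, and the `sketch_*` names, the `attached` field and the list layout are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a watcher and its parent list; the real
 * structures and locking differ. */
struct watcher {
	struct watcher *prev, *next;
	int attached;
};

struct watcher_list {
	struct watcher *head;
};

static void sketch_watcher_attach(struct watcher_list *l, struct watcher *w)
{
	assert(!w->attached);
	w->prev = NULL;
	w->next = l->head;
	if (l->head)
		l->head->prev = w;
	l->head = w;
	w->attached = 1;
}

/* Safe to call whether or not <w> is attached: a second call is a no-op */
static void sketch_watcher_detach(struct watcher_list *l, struct watcher *w)
{
	if (!w->attached)
		return;
	if (w->prev)
		w->prev->next = w->next;
	else
		l->head = w->next;
	if (w->next)
		w->next->prev = w->prev;
	w->prev = w->next = NULL;
	w->attached = 0;
}

/* What a garbage-collection hook for an iterator could do: detach
 * unconditionally, without tracking whether the loop ran to its end. */
static void sketch_iterator_gc(struct watcher_list *l, struct watcher *w)
{
	sketch_watcher_detach(l, w);
}
```

Because the detach is a no-op on an already detached watcher, the finalizer does not need to know whether the iteration completed or was interrupted.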
This was first reported in GH #3055. It can be observed when the server
list is browsed again after it was already browsed from Lua for a given
proxy and the iteration was interrupted before the end. As the
watcher list is corrupted, the common symptom is watcher_attach() or
watcher_next() not ending due to the internal mt_list call looping
forever.
Thanks to GH users @sabretus and @sabretus for their precious help.
It should be backported everywhere 358166a was.
a2base64url() can return a negative value if olen is too short to
accept ilen. This is not supposed to happen since the sha256 should
always fit in a buffer. But this is confusing since a2base64()
returns a signed integer which is put in output->data which is unsigned.
Fix the issue by setting ret to 0 instead of -1 upon error, and return
an unsigned integer instead of a signed one.
This patch also checks the return value from the caller in order
to emit an error instead of setting trash.data which is already done
from the function.
DNS-01 needs an external process which would register a TXT record on a
DNS provider, using a REST API or something else.
To achieve this, the process should read the dpapi sink and wait for
events. With the DNS-01 challenge, HAProxy will put the task to sleep
before asking the ACME server to achieve the challenge. The task then
needs to be woken up, using the command implemented by this patch.
This patch implements the "acme challenge_ready" command which should be
used by the agent once the challenge was configured in order to wake the
task up.
Example:
echo "@1 acme challenge_ready foobar.pem.rsa domain kikyo" | socat /tmp/master.sock -
This commit adds a new message to the dpapi sink which is emitted during
the new authorization request.
One message is emitted per challenge to resolve. The certificate name as
well as the thumbprint of the account key are on the first line of the
message. A dump of the JSON response for 1 challenge follows, then the
message ends with a \0.
The agent consuming these messages MUST NOT access the URLs, and SHOULD
only use the thumbprint, dns and token to configure a challenge.
Example:
$ ( echo "@@1 show events dpapi -w -0"; cat - ) | socat /tmp/master.sock - | cat -e
<0>2025-08-01T16:23:14.797733+02:00 acme deploy foobar.pem.rsa thumbprint Gv7pmGKiv_cjo3aZDWkUPz5ZMxctmd-U30P2GeqpnCo$
{$
"status": "pending",$
"identifier": {$
"type": "dns",$
"value": "foobar.com"$
},$
"challenges": [$
{$
"type": "dns-01",$
"url": "https://0.0.0.0:14000/chalZ/1o7sxLnwcVCcmeriH1fbHJhRgn4UBIZ8YCbcrzfREZc",$
"token": "tvAcRXpNjbgX964ScRVpVL2NXPid1_V8cFwDbRWH_4Q",$
"status": "pending"$
},$
{$
"type": "dns-account-01",$
"url": "https://0.0.0.0:14000/chalZ/z2_WzibwTPvE2zzIiP3BF0zNy3fgpU_8Nj-V085equ0",$
"token": "UedIMFsI-6Y9Nq3oXgHcG72vtBFWBTqZx-1snG_0iLs",$
"status": "pending"$
},$
{$
"type": "tls-alpn-01",$
"url": "https://0.0.0.0:14000/chalZ/AHnQcRvZlFw6e7F6rrc7GofUMq7S8aIoeDileByYfEI",$
"token": "QhT4ejBEu6ZLl6pI1HsOQ3jD9piu__N0Hr8PaWaIPyo",$
"status": "pending"$
},$
{$
"type": "http-01",$
"url": "https://0.0.0.0:14000/chalZ/Q_qTTPDW43-hsPW3C60NHpGDm_-5ZtZaRfOYDsK3kY8",$
"token": "g5Y1WID1v-hZeuqhIa6pvdDyae7Q7mVdxG9CfRV2-t4",$
"status": "pending"$
}$
],$
"expires": "2025-08-01T15:23:14Z"$
}$
^@
This commit emits a log which outputs the TXT entry to create in case of
DNS-01. This is useful in cases where you want to update your TXT entry
manually.
Example:
acme: foobar.pem.rsa: DNS-01 requires to set the "acme-challenge.example.com" TXT record to "7L050ytWm6ityJqolX-PzBPR0LndHV8bkZx3Zsb-FMg"
Files ending with '-t.h' are supposed to be used for structure
definitions and could be included in the same file to check API
definitions.
This patch removes TRACE_SOURCE from acme-t.h to avoid conflicts with
other TRACE_SOURCE definitions.
QUIC MUX may be initialized prior to handshake completion, when 0-RTT is
used. In this case, connection is flagged with CO_FL_EARLY_SSL_HS, which
is notably used by wait-for-hs http rule.
Early data may be subject to replay attacks. For this reason, haproxy
adds the header 'Early-data: 1' to all requests handled as TLS early
data. Thus the server can reject it if it is deemed unsafe. This header
injection is implemented by http-ana. However, it was not functional
with QUIC due to missing CO_FL_EARLY_DATA connection flag.
Fix this by ensuring that QUIC MUX sets CO_FL_EARLY_DATA when needed.
This is performed during qcc_recv() for STREAM frame reception. It is
only set if QC_CF_WAIT_HS is set, meaning that the handshake is not yet
completed. After this, the request is considered safe and Early-data
header is not necessary anymore.
This should fix github issue #3054.
This must be backported up to 3.2 at least. If possible, it should be
backported to all stable releases as well. On these versions, the
current patch relies on the following refactoring commit :
commit 0a53a008d0
MINOR: mux-quic: refactor wait-for-handshake support
Following the latest adjustment on session_add_conn() /
session_check_idle_conn(), detach muxes callbacks were rewritten for
private connection handling.
Nothing really fancy here: some more explicit comments and the removal
of duplicate checks on idle conn status for muxes with true
multiplexing support.
session_check_idle_conn() is called by muxes when a connection becomes
idle. It ensures that the session idle limit is not yet reached. Else,
the connection is removed from the session and it can be freed.
Prior to this patch, session_check_idle_conn() was compatible with a
NULL session argument. In this case, it would return true, considering
that no limit was reached and connection not removed.
However, this renders the function error-prone and subject to future
bugs. This patch streamlines it by ensuring it is never called with a
NULL argument. Thus it can now only return true if connection is kept
in the session or false if it was removed, as first intended.
session_check_idle_conn() is called to flag a connection already
inserted in a session list as idle. If the session limit on the number
of idle connections (max-session-srv-conns) is exceeded, the connection
is removed from the session list.
In addition to the connection removal, session_check_idle_conn()
directly calls MUX destroy callback on the connection. This means the
connection is freed by the function itself and should not be used by the
caller anymore.
This is not practical when an alternative connection closure method
should be used, such as a graceful shutdown with QUIC. As such, remove
MUX destroy invocation: it is now the responsibility of the caller to
either close or release immediately the connection.
Add a BUG_ON() on session_check_idle_conn() to ensure the connection is
not already flagged as CO_FL_SESS_IDLE.
This checks that this function is only called one time per connection
transition from active to idle. This is necessary to ensure that session
idle counter is only incremented one time per connection.
session_add_conn() uses three arguments: connection and session
instances, plus a void pointer labelled as target. Typically, it
represents the server, but can also be a backend instance (for example
on dispatch).
In fact, this argument is redundant as <target> is already a member of
the connection. This commit simplifies session_add_conn() by removing
it. A BUG_ON() on target is extended to ensure it is never NULL.
This commit is the first one of a series to refactor insertion of backend
private connection into the session list.
session_add_conn() is used to attach a connection into a session list.
Previously, this function would report an error if the connection
specified was already attached to another session. However, this case
currently never happens and thus can be considered as buggy.
Remove this check and replace it with a BUG_ON(). This allows to ensure
that session insertion remains consistent. The same check is also
transformed in session_check_idle_conn().
On stream detach on backend side, connection is inserted in the proper
server/session list to be able to reuse it later. If insertion fails and
the connection is idle, the connection can be removed immediately.
If this occurs on a QUIC connection, QUIC MUX implements graceful
shutdown to ensure the server is notified of the closure. However, the
connection instance is not freed. Change this to ensure that both
shutdown and release are performed.
This is preparation work for shared counters between co-processes. As
co-processes will need to share a common date, global_now_ms will be used
for that as it will point to the shm when sharing is enabled.
Thus in this patch we turn global_now_ms into a pointer (and adjust the
places where it is written to and read from; fortunately atomic operations
through pointer are already used so the change is trivial).
For now global_now_ms points to process-local _global_now_ms which is a
fallback for when sharing through the shm is not enabled.
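The pointer-with-local-fallback pattern can be sketched with C11 atomics. This is only an illustration: the `sketch_*` names and the setup hook are hypothetical, and HAProxy uses its own atomic macros and shm handling rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Process-local fallback, used when no shared memory area is available */
static _Atomic unsigned int sketch_local_now_ms;

/* All readers and writers go through this pointer */
static _Atomic unsigned int *sketch_now_ms_ptr = &sketch_local_now_ms;

/* Hypothetical setup hook: point the date to a shared area when one
 * exists, otherwise fall back to the process-local variable. */
static void sketch_set_shared_date(_Atomic unsigned int *shm_area)
{
	sketch_now_ms_ptr = shm_area ? shm_area : &sketch_local_now_ms;
}

static void sketch_update_now_ms(unsigned int ms)
{
	atomic_store(sketch_now_ms_ptr, ms);
}

static unsigned int sketch_read_now_ms(void)
{
	return atomic_load(sketch_now_ms_ptr);
}
```

Callers never need to know which storage is in use, which is why switching from a plain variable to a pointer only touches the places that read or write the date.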
75e480d10 ("MEDIUM: stats: avoid 1 indirection by storing the shared
stats directly in counters struct") took care of renaming
counters_fe_shared_init() but we forgot counters_be_shared_init().
Let's fix that for consistency
As discussed in GH #3051, default-path is not taken into account when
loading files using lua-load-per-thread. In fact, the initial
hlua_load_state() (performed on first thread which parses the config)
is successful, but other threads run hlua_load_state() later based
on config hints which were saved by the first thread, and those config
hints only contain the file path provided on the lua-load-per-thread
config line, not the absolute one. Indeed, `default-path` directive
changes the current working directory only for the thread parsing the
configuration.
To fix the issue, when storing config hints under hlua_load_per_thread()
we now make sure to save the absolute file path for the
`lua-load-per-thread` argument.
Thanks to GH user @zhanhb for having reported the issue
It may be backported to all stable versions.
Implement traces for the ACME protocol.
-dt acme:data:complete will dump all input and output buffers,
including decoded buffers before being converted to JWS.
It will also dump certificates in the traces.
-dt acme:user:complete will only dump the state of the task handler.
Released version 3.3-dev5 with the following main changes :
- BUG/MEDIUM: queue/stats: also use stream_set_srv_target() for pendconns
- DOC: list missing global QUIC settings
Complete list of global keywords with missing QUIC entries.
This could be backported to stable versions. This requires to take into
account the version of introduction for each keyword.
* limited-quic, introduced in 2.8
* no-quic, introduced in 2.8
* tune.quic.cc.cubic.min-losses, introduced in 3.1
Following c24de07 ("OPTIM: stats: store fast sharded counters pointers
at session and stream level") some crashes were observed in
connect_server():
#0 0x00000000007ba39c in connect_server (s=0x65117b0) at src/backend.c:2101
2101 _HA_ATOMIC_INC(&s->sv_tgcounters->connect);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-325.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 nss-softokn-freebl-3.67.0-3.el7_9.x86_64 pcre-8.32-17.el7.x86_64
(gdb) bt
#0 0x00000000007ba39c in connect_server (s=0x65117b0) at src/backend.c:2101
#1 0x00000000007baff8 in back_try_conn_req (s=0x65117b0) at src/backend.c:2378
#2 0x00000000006c0e9f in process_stream (t=0x650f180, context=0x65117b0, state=8196) at src/stream.c:2366
#3 0x0000000000bd3e51 in run_tasks_from_lists (budgets=0x7ffd592752e0) at src/task.c:655
#4 0x0000000000bd49ef in process_runnable_tasks () at src/task.c:889
#5 0x0000000000851169 in run_poll_loop () at src/haproxy.c:2834
#6 0x0000000000851865 in run_thread_poll_loop (data=0x1a03580 <ha_thread_info>) at src/haproxy.c:3050
#7 0x0000000000852a53 in main (argc=7, argv=0x7ffd592755f8) at src/haproxy.c:3637
Here the crash occurs during the atomic inc of a sv_tgcounters metric from
the stream pointer, which tells us the pointer is likely garbage.
In fact, we assign s->sv_tgcounters each time the stream target is set to
a valid server. For that we use stream_set_srv_target() helper which does
the assignment for us. By reviewing the code, it turns out we forgot to call
stream_set_srv_target() in pendconn_dequeue(), where the stream target
is set to the server which picked the pendconn.
Let's fix the bug by using stream_set_srv_target() there.
No backport needed unless c24de07 is.
Released version 3.3-dev4 with the following main changes :
- CLEANUP: server: do not check for duplicates anymore in findserver()
- REORG: server: move findserver() from proxy.c to server.c
- MINOR: server: use the tree to look up the server name in findserver()
- CLEANUP: server: rename server_find_by_name() to server_find()
- CLEANUP: server: rename findserver() to server_find_by_name()
- CLEANUP: server: use server_find_by_name() where relevant
- CLEANUP: cfgparse: lookup proxy ID using existing functions
- CLEANUP: stream: lookup server ID using standard functions
- CLEANUP: server: simplify server_find_by_id()
- CLEANUP: server: add server_find_by_addr()
- CLEANUP: stream: use server_find_by_addr() in sticking_rule_find_target()
- CLEANUP: server: be sure never to compare src against a non-existing defsrv
- MEDIUM: proxy: take the defsrv out of the struct proxy
- MINOR: proxy: add checks for defsrv's validity
- MEDIUM: proxy: no longer allocate the default-server entry by default
- MEDIUM: proxy: register a post-section cleanup function
- MINOR: debug: report haproxy and operating system info in panic dumps
- BUG/MEDIUM: h3: do not overwrite interim with final response
- BUG/MINOR: h3: properly realloc buffer after interim response encoding
- BUG/MINOR: h3: ensure that invalid status code are not encoded (FE side)
- MINOR: qmux: change API for snd_buf FIN transmission
- BUG/MEDIUM: h3: handle interim response properly on FE side
- BUG/MINOR: h3: properly handle interim response on BE side
- BUG/MINOR: quic: Wrong source address use on FreeBSD
- MINOR: h3: remove unused outbuf in h3_resp_headers_send()
- BUG/MINOR: applet: Don't trigger BUG_ON if the tid is not on appctx init
- DEV: gdb: add a memprofile decoder to the debug tools
- MINOR: quic: Get rid of qc_is_listener()
- DOC: connection: explain the rules for idle/safe/avail connections
- BUG/MEDIUM: quic-be: CC buffer released from wrong pool
- BUG/MINOR: halog: exit with error when some output filters are set simultaneously
- MINOR: cpu-topo: split cpu_dump_topology() to show its summary in show dev
- MINOR: cpu-topo: write thread-cpu bindings into trash buffer
- MINOR: debug: align output style of debug_parse_cli_show_dev with cpu_dump_topology
- MINOR: debug: add thread-cpu bindings info in 'show dev' output
- MINOR: quic: Remove pool_head_quic_be_cc_buf pool
- BUILD: debug: add missed guard USE_CPU_AFFINITY to show cpu bindings
- BUG/MEDIUM: threads: Disable the workaround to load libgcc_s on macOS
- BUG/MINOR: logs: fix log-steps extra log origins selection
- BUG/MINOR: hq-interop: fix FIN transmission
- MINOR: ssl: Add ciphers in ssl traces
- MINOR: ssl: Add curve id to curve name table and mapping functions
- MINOR: ssl: Add curves in ssl traces
- MINOR: ssl: Dump ciphers and sigalgs details in trace with 'advanced' verbosity
- MINOR: ssl: Remove ClientHello specific traces if !HAVE_SSL_CLIENT_HELLO_CB
- MINOR: h3: use smallbuf for request header emission
- MINOR: h3: add traces to h3_req_headers_send()
- BUG/MINOR: h3: fix uninitialized value in h3_req_headers_send()
- MINOR: log: explicitly ignore "log-steps" on backends
- BUG/MEDIUM: acme: use POST-as-GET instead of GET for resources
- BUG/MINOR: mux-quic: apply correctly timeout on output pending data
- BUG/MINOR: mux-quic: ensure close-spread-time is properly applied
- MINOR: mux-quic: refactor timeout code
- MINOR: mux-quic: correctly implement backend timeout
- MINOR: mux-quic: disable glitch on backend side
- MINOR: mux-quic: store session in QCS instance
- MEDIUM: mux-quic: implement be connection reuse
- MINOR: mux-quic: do not reuse connection if app already shut
- MEDIUM: mux-quic: support backend private connection
- MINOR: acme: remove acme_req_auth() and use acme_post_as_get() instead
- BUG/MINOR: acme: allow "processing" in challenge requests
- CLEANUP: acme: fix wrong spelling of "resources"
- CLEANUP: ssl: Use only NIDs in curve name to id table
- MINOR: acme: add ACME to the haproxy -vv feature list
- BUG/MINOR: hlua: Skip headers when a receive is performed on an HTTP applet
- BUG/MEDIUM: applet: State inbuf is no longer full if input data are skipped
- BUG/MEDIUM: stconn: Fix conditions to know an applet can get data from stream
- BUG/MINOR: applet: Fix applet_getword() to not return one extra byte
- BUG/MEDIUM: Remove sync sends from streams to applets
- MINOR: applet: Add HTX versions for applet_input_data() and applet_output_room()
- MINOR: applet: Improve applet API to take care of inbuf/outbuf alloc failures
- MEDIUM: hlua: Update the tcp applet to use its own buffers
- MINOR: hlua: Fill the request array on the first HTTP applet run
- MINOR: hlua: Use the buffer instead of the HTTP message to get HTTP headers
- MEDIUM: hlua: Update the http applet to use its own buffers
- BUG/MEDIUM: hlua: Report to SC when data were consumed on a lua socket
- BUG/MEDIUM: hlua: Report to SC when output data are blocked on a lua socket
- MEDIUM: hlua: Update the socket applet to use its own buffers
- BUG/MEDIUM: dns: Reset reconnect tempo when connection is finally established
- MEDIUM: dns: Update the dns_session applet to use its own buffers
- CLEANUP: http-client: Remove useless indentation when sending request body
- MINOR: http-client: Try to send request body with headers if possible
- MINOR: http-client: Trigger an error if first response block isn't a start-line
- BUG/MINOR: httpclient-cli: Don't try to dump raw headers in HTX mode
- MINOR: httpclient-cli: Reset httpclient HTX buffer instead of removing blocks
- MEDIUM: http-client: Update the http-client applet to use its own buffers
- MEDIUM: log: Update the log applet to use its own buffers
- MEDIUM: sink: Update the sink applets to use their own buffers
- MEDIUM: peers: Update the peer applet to use its own buffers
- MEDIUM: promex: Update the promex applet to use their own buffers
- MINOR: applet: Add support for flags on applets with a flag about the new API
- MEDIUM: applet: Emit a warning when a legacy applet is spawned
- BUG/MEDIUM: logs: fix sess_build_logline_orig() recursion with options
- MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct
- CLEANUP: compiler: prefer char * over void * for pointer arithmetic
- CLEANUP: include: replace hand-rolled offsetof to avoid UB
- CLEANUP: peers: remove unused peer_session_target()
- OPTIM: stats: store fast sharded counters pointers at session and stream level
Following commit 75e480d10 ("MEDIUM: stats: avoid 1 indirection by storing
the shared stats directly in counters struct"), in order to minimize the
impact of the recent sharded counters work, we try to push things a bit
further in this patch by storing and using "fast" pointers at the session
and stream levels when available to avoid costly indirections and
systematic "tgid" resolution (which can not be cached by the CPU due to
its THREAD-local nature).
Indeed, we know that a session/stream is tied to a given CPU; thanks to
this we know that the tgid for a given session/stream will never change.
Given that, we are able to store sharded frontend and listener counters
pointer at the session level (namely sess->fe_tgcounters and
sess->li_tgcounters), and once the backend and the server are selected,
we are also able to store backend and server sharded counters
pointer at the stream level (namely s->be_tgcounters and s->sv_tgcounters)
Everywhere we rely on these counters and the stream or session context is
available, we use the fast pointers instead of the indirect pointers
path to make the pointer resolution a bit faster.
This optimization proved to bring a few percent back, and together with
the previous 75e480d10 commit we now fixed the performance regression (we
are on par with 3.2 stats performance).
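The "fast pointer" idea can be sketched as resolving the per-thread-group counters once and caching the result in the session. The structures below are simplified assumptions (only `fe_tgcounters` is a name taken from the text above), and the plain increment stands in for the atomic operations used in the real code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_TGROUPS 4

struct sketch_tgcounters {
	unsigned long cum_req;
};

struct sketch_proxy {
	struct sketch_tgcounters *tg[SKETCH_MAX_TGROUPS]; /* one per group */
};

struct sketch_session {
	struct sketch_proxy *fe;
	struct sketch_tgcounters *fe_tgcounters; /* cached "fast" pointer */
};

/* Resolve the per-group counters once, when the session is created.
 * <tgid> is 1-based in this sketch. */
static void sketch_session_init(struct sketch_session *s,
                                struct sketch_proxy *fe, unsigned int tgid)
{
	assert(tgid >= 1 && tgid <= SKETCH_MAX_TGROUPS);
	s->fe = fe;
	s->fe_tgcounters = fe->tg[tgid - 1];
}

/* Hot path: a single dereference instead of proxy -> array -> tgid lookup */
static void sketch_count_request(struct sketch_session *s)
{
	s->fe_tgcounters->cum_req++;
}
```

This only works because the thread group of a session does not change during its lifetime, so the cached pointer stays valid.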
Since commit 7293eb68 ("MEDIUM: peers: use server as stream target") peer
session target always points to server in order to benefit from existing
server transport options.
Thanks to that, it is no longer necessary to have peer_session_target()
helper function, because all it does is return the pointer to the
server object. Let's get rid of that
The C standard specifies that it's undefined behavior to dereference
NULL (even if you use & right after). The hand-rolled offsetof idiom
&(((s*)NULL)->f) is thus technically undefined. This clutters the
output of UBSan and is simple to fix: just use the real offsetof when
it's available.
Note that there's no clear statement about this point in the spec,
only several points which together converge to this:
- From N3220, 6.5.3.4:
A postfix expression followed by the -> operator and an identifier
designates a member of a structure or union object. The value is
that of the named member of the object to which the first expression
points, and is an lvalue.
- From N3220, 6.3.2.1:
An lvalue is an expression (with an object type other than void) that
potentially designates an object; if an lvalue does not designate an
object when it is evaluated, the behavior is undefined.
- From N3220, 6.5.4.4 p3:
The unary & operator yields the address of its operand. If the
operand has type "type", the result has type "pointer to type". If
the operand is the result of a unary * operator, neither that operator
nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply and
the result is not an lvalue. Similarly, if the operand is the result
of a [] operator, neither the & operator nor the unary * that is
implied by the [] is evaluated and the result is as if the & operator
were removed and the [] operator were changed to a + operator.
=> In short, this is saying that C guarantees these identities:
1. &(*p) is equivalent to p
2. &(p[n]) is equivalent to p + n
As a consequence, &(*p) doesn't result in the evaluation of *p, only
the evaluation of p (and similar for []). There is no corresponding
special carve-out for ->.
See also: https://pvs-studio.com/en/blog/posts/cpp/0306/
After this patch, HAProxy can run without crashing after building w/
clang-19 -fsanitize=undefined -fno-sanitize=function,alignment
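As a small illustration of the replacement, the sketch below uses the standard offsetof macro where the hand-rolled idiom would otherwise appear. The structure and helper names are hypothetical and are not taken from the HAProxy sources.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_item {
	char tag;
	int value;
	void *ptr;
};

/* The hand-rolled idiom would be written as:
 *   ((size_t)&(((struct sketch_item *)NULL)->value))
 * It forms an lvalue through a null pointer. The standard macro below
 * yields the same number without any such expression. */
#define SKETCH_OFFSET_OF_VALUE offsetof(struct sketch_item, value)

/* container_of-style helper built on offsetof and char * arithmetic:
 * returns the structure that embeds the <value> member pointed to by <v>. */
static struct sketch_item *sketch_item_from_value(int *v)
{
	return (struct sketch_item *)((char *)v - SKETCH_OFFSET_OF_VALUE);
}
```

offsetof comes from <stddef.h> and is available in every hosted and freestanding C implementation, so no fallback is needed in this sketch.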
This patch changes two instances of pointer arithmetic on void *
to use char * instead, to avoid UB. This is essentially to please
UB analyzers, though.
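A minimal sketch of the kind of change involved, using a hypothetical helper name: standard C does not define arithmetic on void * (compilers such as GCC accept it as an extension), so the pointer is cast to char * before adding a byte offset.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the address <off> bytes past <base>. Casting to char * first
 * keeps the arithmetic within standard C. */
static void *sketch_ptr_add(void *base, size_t off)
{
	return (char *)base + off;
}
```

The result converts back to void * implicitly, so callers written against the void * version do not need to change.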
Between 3.2 and 3.3-dev we observed a noticeable performance regression
due to stats handling. After bisecting, Willy found out that recent
work to split stats computing across multiple thread groups (stats
sharding) was responsible for that performance regression. We're looking
at roughly 20% performance loss.
More precisely, it is the added indirections, multiplied by the number
of statistics that are updated for each request, which in the end causes
a significant amount of time being spent resolving pointers.
We noticed that the fe_counters_shared and be_counters_shared structures
which are currently allocated in dedicated memory since a0dcab5c
("MAJOR: counters: add shared counters base infrastructure")
are no longer huge since 16eb0fab31 ("MAJOR: counters: dispatch counters
over thread groups") because they now essentially hold flags plus the
per-thread group id pointer mapping, not the counters themselves.
As such we decided to try merging fe_counters_shared and
be_counters_shared in their parent structures. The cost is slight memory
overhead for the parent structure, but it allows to get rid of one
pointer indirection. This patch alone yields visible performance gains
and almost restores 3.2 stats performance.
counters_fe_shared_get() was renamed to counters_fe_shared_prepare() and
now returns either failure or success instead of a pointer because we
don't need to retrieve a shared pointer anymore: the function takes care
of initializing the existing pointer.
Since ccc43412 ("OPTIM: log: use thread local lf_buildctx to stop pushing
it on the stack"), recursively calling sess_build_logline_orig(), which
may for instance happen when leveraging %ID (or unique-id fetch) for the
first time, would lead to undefined behavior because the parent
sess_build_logline_orig() build context was shared between recursive calls
(only one build ctx per thread to avoid pushing it on the stack for each
call)
In short, the parent build ctx would be altered by the recursive calls,
which is obviously not expected and could result in log formatting errors.
To fix the issue but still avoid polluting the stack with large lf_buildctx
struct, let's move the static 256 bytes build buffer out of the buildctx
so that the buildctx is now stored in the stack again (each function
invocation has its own dedicated build ctx). On the other hand, it's
acceptable to have only one 256-byte build buffer per thread because the
build buffer is not involved in recursive calls (unlike the build ctx).
Thanks to Willy and Vincent Gramer for spotting the bug and providing
useful repro.
It should be backported in 3.0 with ccc43412.
To motivate developers to support the new applets API, a warning is now
emitted when a legacy applet is spawned. To not flood users, this warning is
only emitted once per legacy applet. To do so, the applet flag
APPLET_FL_WARNED was added. It is set when the warning is emitted.
Note that test and set on this flag are not performed via atomic operations.
So it is possible to have more than one warning for a given applet if it is
spawned at the same time on several threads. At worst, there is one warning per
thread.
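The warn-once behaviour can be sketched as a plain test-and-set on a flags field. The flag values, the structure and the message text below are assumptions for the example; only the idea of a "warned" flag and a "new API" flag comes from the commit messages.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_FL_NEW_API 0x1u  /* applet implements the new API */
#define SKETCH_FL_WARNED  0x2u  /* legacy warning already emitted */

struct sketch_applet {
	const char *name;
	unsigned int flags;
};

/* Returns 1 if a warning was emitted, 0 otherwise. The test and the set
 * are deliberately not atomic, so threads spawning the same applet at
 * the same time could each emit the warning once. */
static int sketch_warn_if_legacy(struct sketch_applet *a)
{
	if (a->flags & (SKETCH_FL_NEW_API | SKETCH_FL_WARNED))
		return 0;
	a->flags |= SKETCH_FL_WARNED;
	fprintf(stderr, "applet '%s' uses the legacy API\n", a->name);
	return 1;
}
```

Skipping atomics keeps the check cheap; the cost is a bounded number of duplicate warnings, which is acceptable for a diagnostic message.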
A new field was added in the applet structure to be able to set flags on the
applets. The first one is related to the new API. APPLET_FL_NEW_API is set
for applets based on the new API. It was set on all HAProxy's applets.
Thanks to this patch, the promex applet is now using its own buffers.
.rcv_buf and .snd_buf callback functions are now defined to use the default
HTX functions. Parts to receive and send data have also been updated to use
the applet API and to remove any dependencies on the stream-connectors and
the channels.
Thanks to this patch, the peer applet is now using its own buffers. .rcv_buf
and .snd_buf callback functions are now defined to use the default raw
functions. The applet API is now used and any dependencies on the
stream-connectors and the channels were removed.
Thanks to this patch, the sink applets are now using their own buffers.
.rcv_buf and .snd_buf callback functions are now defined to use the default
raw functions. The applet API is now used and any dependencies on the
stream-connectors and the channels were removed.
Thanks to this patch, the log applet is now using its own buffers. .rcv_buf
and .snd_buf callback functions are now defined to use the default raw
functions. The applet API is now used and any dependencies on the
stream-connectors and the channels were removed.
Thanks to this patch, the http-client applet is now using its own buffers.
.rcv_buf and .snd_buf callback functions are now defined to use the default
HTX functions. Parts to receive and send data have also been updated to use
the applet API and to remove any dependencies on the stream-connectors and
the channels.
In the CLI I/O handler interacting with the HTTP client, in HTX mode, after
a dump of the HTX message, data must be removed. Instead of removing all
blocks one by one, we can call htx_reset() because the whole message must be
flushed.
In the CLI I/O handler interacting with the HTTP client, we must not try to
push raw headers in HTX mode, because there is no raw data in this
mode. This prevents the HTX dump at the end of the I/O handler.
It is a 3.3-specific issue. No backport needed.
The first HTX block of a response must be a start-line. There is no reason
to wait for something else. And if there are output data in the response
channel buffer, it means we must find the start-line.
There is no reason to yield after sending the request headers, except if the
request was fully sent. If there is a payload, it is better to send it as
well. However, when the whole request was sent, we can leave the I/O handler.
Thanks to this patch, the dns_session applet is now using its own
buffers. .rcv_buf and .snd_buf callback functions are now defined to use the
default raw functions. Functions to receive and send data have also been
updated to use the applet API and to remove any dependencies on the
stream-connectors and the channels.
The issue was introduced by commit 27236f221 ("BUG/MINOR: dns: add tempo
between 2 connection attempts for dns servers"). In this patch, to delay the
reconnection, a timer is used on the appctx when it is created. This
postpones the appctx initialization. However, once initialized, the
expiration time of the underlying task is not reset. So, it is always
considered as expired and the appctx is woken up in loop.
The fix is quite simple. In dns_session_init(), the expiration time of the
appctx's task is always set to TICK_ETERNITY.
This patch must be backported everywhere the commit above was backported. So
as far as 2.8 for now but possibly to all stable versions.
Thanks to this patch, the lua cosocket applet is now using its own
buffers. .rcv_buf and .snd_buf callback functions are now defined to use the
default raw functions. Functions to receive and send data have also been
updated to use the applet API and to remove any dependencies on the
stream-connectors and the channels.
It is a fix similar to the previous one ("BUG/MEDIUM: hlua: Report to SC
when data were consumed on a lua socket"), but for the write side. The
writer must notify the cosocket it needs more space in the request buffer to
produce more data by calling sc_need_room(). Otherwise, nothing prevents the
cosocket applet from being woken up again and again.
This patch must be backported as far as 2.8, and maybe to 2.6 too.
The lua cosockets are quite strange. There is an applet used to handle the
connection, and writers and readers subscribe to it to write or read
data. Writers and readers are tasks woken up by the cosocket applet when
data can be consumed or produced, depending on the state of the channels'
buffers. Then the cosocket applet is woken up by writers and readers when read
or write events were performed.
It means the cosocket applet has only little information on what was produced
or consumed. It is the writers' and readers' responsibility to notify any
blocking. Among other things, the readers must take care to notify the
stream on top of the cosocket applet that some data was consumed. Otherwise,
it may remain blocked, waiting for a write event (a write event from the
stream point of view is a read event from the cosocket point of view).
This patch must be backported as far as 2.8, and maybe to 2.6 too.
Thanks to this patch, the lua HTTP applet is now using its own buffers.
.rcv_buf and .snd_buf callback functions are now defined to use the default
HTX functions. Functions to receive and send data have also been updated to
use the applet API and to remove any dependencies on the stream-connectors
and the channels.
hlua_http_get_headers() function was using the HTTP message from the stream
TXN to retrieve headers from a message. However, this will be an issue when
updating the lua HTTP applet to use its own buffers. Indeed, in that case,
information from the channels will be unavailable. So
hlua_http_get_headers() is now using a buffer containing an HTX message. It
is just an API change because, internally, the function was already
manipulating an HTX message.
When a lua HTTP applet is created, a "request" object is created, filled
with the request information (method, path, headers...), to be able to
easily retrieve this information from the script. However, this was done
when the appctx was created, retrieving the info from the request channel.
To be able to update the applet to use its own buffer, it is now performed on
the first applet run. Indeed, when the applet is created, the info is not
forwarded yet and should not be accessed. Note that for now, information is
still retrieved from the channel.
Thanks to this patch, the lua TCP applet is now using its own buffers.
.rcv_buf and .snd_buf callback functions are now defined to use the default
raw functions. Other changes are quite light. Mainly, end of stream and
errors are reported on the appctx instead of the stream-endpoint descriptor.
applet_get_inbuf() and applet_get_outbuf() functions were not testing if the
buffers were available. So, the caller had to check them before calling one
of these functions. It is not really handy. So now, these functions take
care to have a fully usable buffer before returning. Otherwise NULL is
returned.
It will be useful for HTX applets because available data in the input buffer and
available space in the output buffer are computed from the HTX message and not
the buffer itself. So now, applet_htx_input_data() and applet_htx_output_room()
functions can be used.
When the applet API was reviewed to use dedicated buffers, the support for
sends from the streams to applets was added. Unfortunately, it was not a
good idea because this way it is possible to deliver data to an applet and
release it just after, truncating data. Indeed, the release stage for applets
is related to the stream release itself. However, unlike the multiplexers,
the applets cannot outlive a stream for now.
So, for now, the sync sends from the streams are removed for applets, waiting
for a better way to handle the applets release stage.
Note that this only concerns applets using their own buffers. As of now,
the bug is harmless because all refactored applets are on server side and
consume data first. But this will be an issue with the HTTP client.
This patch should be backported as far as 3.0 after a period of observation.
applet_getword() function is returning one extra byte when a string is
returned because the "ret" variable is not reset before the loop on the
data. The patch also fixes applet_getline().
It is a 3.3-specific issue. No need to backport.
sc_is_send_allowed() function is used to know if an applet is able to
receive data from the stream. But this function was designed for applets
using the channels buffer. It is not adapted to applets using their own
buffers.
When the SE_FL_WAIT_DATA flag is set, it means the applet is waiting for
more data and should not be woken up without new data. For applets using
channels buffer, just testing the flag is enough because process_stream()
will remove it when more data are available. For applets using their own
buffers, it is more complicated. Some data may be blocked in the output
channel buffer. In that case, and when the applet input buffer can receive
data, the applet can be woken up.
This patch must be backported as far as 3.0 after a period of observation.
When data are skipped from the input buffer of an applet, we must take care
to notify that the input buffer is no longer full. Otherwise, this could prevent
the stream from pushing data to the applet.
It is 3.3-specific. No backport needed.
When an HTTP applet tries to retrieve data, the request headers are still in
the buffer. But, instead of being silently removed, their size is removed
from the amount of data retrieved. When the request payload is fully
retrieved, it is not an issue. But it is a problem when a length is
specified. The data are shortened by the headers size.
So now, we take care to silently remove headers.
This patch must be backported to all stable versions.
The curve name to curve id mapping table was built out of multiple
internal tables found in openssl sources, namely the 'nid_to_group'
table found in 'ssl/t1_lib.c' which maps openssl specific NIDs to public
IANA curve identifiers. In this table, there were two instances of
EVP_PKEY_XXX ids being used while all the other ones are NID_XXX
identifiers.
Since the two EVP_PKEY are actually equal to their NID equivalent in
'include/openssl/evp.h' we can use NIDs all along for better coherence.
Allow the "processing" status in the challenge object when requesting
to do the challenge, in addition to "pending".
According to RFC 8555 https://datatracker.ietf.org/doc/html/rfc8555/#section-7.1.6
Challenge objects are created in the "pending" state. They
transition to the "processing" state when the client responds to the
challenge (see Section 7.5.1)
However, some CAs could respond with a "processing" state without ever
transitioning to "pending".
Must be backported to 3.2.
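The relaxed check can be sketched as follows. This is only an illustration:
the helper name acme_chall_status_ok() is hypothetical and is not taken from
the ACME code; only the two status strings come from the description above.

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the challenge object is in a state
 * where asking the CA to perform the challenge is still accepted, i.e.
 * "pending" or, since this change, "processing". Returns 0 otherwise.
 */
static int acme_chall_status_ok(const char *status)
{
	return strcmp(status, "pending") == 0 ||
	       strcmp(status, "processing") == 0;
}
```

Any other status (for instance "invalid") is still rejected by such a test.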
If a backend connection is private, it should not be reused outside of
its original attached session. As such, on stream detach operation, such
connection is never inserted into server idle/avail list. Instead, it is
stored directly on the session.
The purpose of this commit is to implement proper handling of private
backend connections via QUIC multiplexer.
QUIC connection graceful closure is performed in two steps. First, the
application layer is closed. In the context of HTTP/3, this is done with
a GOAWAY frame emission, which forbids opening of new streams. Then the
whole connection is terminated via CONNECTION_CLOSE which is the final
emitted frame.
This commit ensures that when app layer is shut for a backend
connection, this connection is removed from either idle or avail server
tree. The objective is to prevent the stream layer from trying to reuse a
connection if no new stream can be attached to it.
New BUG_ON checks are inserted in qmux_strm_attach() and h3_attach() to
ensure that this assertion is always true.
Implement support for QUIC connection reuse on the backend side. The
main change is done during detach stream operation. If a connection is
idle, it is inserted in the server list. Else, it is stored in the
server avail tree if there is room for more streams.
For a non-idle connection, qmux_avail_streams() is reused to detect whether
the stream flow-control limit is reached. If this is the case, the
connection is not inserted in the avail tree, so it cannot be reused,
even if flow-control is unblocked later by the peer. This latter point
could be improved in the future.
Note that support for QUIC private connections is still missing. Reuse
code will evolve to fully support this case.
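The decision taken on stream detach can be summarized by the sketch below.
It is a simplified model, not the MUX code: the enum, the function name and
its two integer parameters are hypothetical stand-ins for the connection
state and for the value returned by qmux_avail_streams().

```c
/* Hypothetical outcome of the detach operation for a backend connection. */
enum reuse_slot {
	REUSE_NONE,  /* not stored for reuse */
	REUSE_IDLE,  /* stored in the server idle list */
	REUSE_AVAIL, /* stored in the server avail tree */
};

/* nb_streams:    streams still attached to the connection
 * avail_streams: how many more streams may be opened right now
 *                (0 when the stream flow-control limit is reached)
 */
static enum reuse_slot conn_reuse_slot(int nb_streams, int avail_streams)
{
	if (nb_streams == 0)
		return REUSE_IDLE;
	if (avail_streams > 0)
		return REUSE_AVAIL;
	/* limit reached: not reusable, even if the peer unblocks it later */
	return REUSE_NONE;
}
```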
Add a new <sess> member into QCS structure. It is used to store the
parent session of the stream on attach operation. This is only done for
backend side.
This new member will become necessary when connection reuse is
implemented. <owner> member of connection is not suitable as it could be
set to NULL, notably after a session_add_conn() failure.
Also, a single BE conn can be shared among different session instances,
in particular when using aggressive/always reuse mode. Thus it is
necessary to link each QCS instance with its session.
For now, QUIC glitch limit counter is only available on the frontend
side. Thus, disable incrementation on the backend side for now. Also,
session is only available as conn <owner> reliably on the frontend side,
so session_add_glitch_ctr() operation is also secured.
qcc_refresh_timeout() is the function called on QUIC MUX activity. Its
purpose is to update the timeout by selecting the correct value
depending on the connection state.
Prior to this patch, backend connections were mostly ignored by the
function. However, the default server timeout was selected as a
fallback. This is incompatible with backend connections reuse.
This patch fixes timeout applied on backend connections. Only values
specific to frontend which are http-request and http-keep-alive timeouts
are now ignored for a backend connection. Also, fallback timeout is only
used for frontend connections.
This patch ensures that an idle backend connection won't be deleted due
to server timeout. This is necessary for proper connection reuse which
will be implemented in a future patch.
This commit is a small reorganization of the conditions used in
qcc_refresh_timeout(). Its objective is to make the code more logical
before the next patch which will ensure that timeout is properly set for
backend connections.
If a connection remains on a proxy currently disabled or stopped, a
special spread timeout is set if active close is configured. For QUIC
MUX, this is set via qcc_refresh_timeout() as with all other timeout
values.
Fix this closing timeout setting : it is now used as an override to any
other timeout that may have been chosen if calculated spread time is
lower than the previously selected value. This is done for backend
connections as well.
This should be backported up to 2.6 after a period of observation.
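The override rule can be illustrated with the sketch below. It is a
simplification under stated assumptions: timeouts are plain millisecond
values where 0 means "not set", whereas the real code works on ticks; the
function name is hypothetical, and applying the spread value when no other
timeout was selected is an assumption of this sketch.

```c
#define NO_TIMEOUT 0u /* hypothetical "not set" marker for this sketch */

/* selected: timeout already chosen from the connection state
 * spread:   calculated closing spread time, NO_TIMEOUT if not applicable
 * Returns the timeout to apply: the spread time overrides the selected
 * value only when it is lower.
 */
static unsigned int apply_close_spread(unsigned int selected,
                                       unsigned int spread)
{
	if (spread == NO_TIMEOUT)
		return selected;
	if (selected == NO_TIMEOUT || spread < selected)
		return spread;
	return selected;
}
```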
When no stream is attached, the mux layer is responsible for maintaining a
timeout. The first criterion is to apply client/server timeout if there
is still data waiting for emission.
Previously, <hreq> qcc member was used to determine this state. However,
this only covers bidirectional streams. Fix this by testing if
<send_list> is empty or not. This is enough to take into account both
bidi and uni streams.
Theoretically, this should be backported to every stable version.
However, send-list is not available on 2.6 and there is no alternative
to quickly determine if there is waiting output data. Thus, it's better
to backport it up to 2.8 only.
The requests that checked the status of the challenge and the retrieval
of the certificate were done using a GET.
This is working with letsencrypt and other CA providers, but it might
not work everywhere. RFC 8555 specifies that only the directory and
newNonce resources MUST work with GET requests, but everything else
must use POST-as-GET.
Must be backported to 3.2.
"log-steps" was already ignored if directly defined in a backend section,
however, when defined in a defaults section it was inherited to all
proxies no matter their capability (ie: including backends).
As configurations often contain more backends than frontends, this would
result in wasted memory given that the log-steps setting is only
considered on frontends.
Let's fix that by preventing the inheritance from defaults section to
anything else than frontends. Also adjust the documentation to mention
that the setting is not relevant for backends.
Due to the introduction of smallbuf usage for HTTP/3 headers emission,
ret variable may be used uninitialized if buffer allocation fails due to
not enough room in QUIC connection window.
Fix this by setting ret value to 0.
Function variable declarations are also adjusted so that the pattern is
similar to h3_resp_headers_send(). Finally, outbuf buffer is also
removed as it is now unused.
No need to backport.
Similarly to HTTP/3 response encoding, a small buffer is first allocated
for the request encoding on the backend side. If this is not sufficient,
the smallbuf is replaced by a standard buffer and encoding is restarted.
This is useful to reduce the window usage over a connection of smaller
requests.
SSL libraries like wolfSSL that don't have the clienthello callback
mechanism enabled do not need to have the traces that are only called
from the said callback.
The code added to parse the ciphers relied on a function that was not
defined in wolfSSL (SSL_CIPHER_find).
The contents of the extensions were only dumped with verbosity
'complete' which meant that the 'advanced' verbosity was pretty much
useless despite what its name implies (it was the same as the 'simple'
one).
The 'advanced' verbosity is now the "maximum" one, using 'complete'
would not add any extra information yet, but it leaves more room for
some actually large traces to be dumped later on (some complete
ClientHello dumps for instance).
The SSL libraries like OpenSSL for instance do not seem to actually
provide a public mapping between IANA defined curve IDs and curve names,
or even a mapping between curve IDs and internal NIDs.
This new table groups all this information in a single table so that
we can convert curve names (be it SECG or NIST format) to curve IDs or
NIDs.
The previously existing 'curves2nid' function now uses the new table,
and a new 'curveid2str' one is added.
Since the following patch, app_ops layer is now responsible to report
that HTX block was the last transmitted so that FIN STREAM can be set.
This is mandatory to properly support HTTP 1xx interim responses.
f349df44b4
MINOR: qmux: change API for snd_buf FIN transmission
This change was correctly implemented in HTTP/3 code, however an issue
appeared on hq-interop transcoder in case zero-copy DATA transfer is
performed when HTX buffer is swapped. If this occurred during the
transfer of the last HTX block, EOM is not detected and thus STREAM FIN
is never set.
Most of the time, QMUX shut callback is called immediately after. This
results in an emission of a RESET_STREAM to the client, which prevents
the data transfer.
To fix this, use the same method as HTTP/3 : HTX EOM flag status is
checked before any transfer, thus preserving it even after a zero-copy.
Criticality of this bug is low as hq-interop is experimental and is mostly
used for interop testing.
This should fix github issue #3038.
This patch must be backported wherever the above one is.
Willy noticed that it was not possible to select extra log origins using
log-steps directive. Extra origins are the ones registered using
log_orig_register() such as http-req.
Reason was the error path was always executed during extra log origin
matching for log-steps parser, while it should only be executed if no
match was found.
It should be backported to 3.1.
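The corrected control flow looks like the sketch below. It is not the
log-steps parser: the function name and the array of origin names are
hypothetical, and only the "error path runs when no match was found" logic
reflects the description above.

```c
#include <stddef.h>
#include <string.h>

/* Looks up <arg> among <count> registered origin names.
 * Returns the index of the matching origin, or -1 when none matches.
 */
static int find_extra_origin(const char *const *names, size_t count,
                             const char *arg)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (strcmp(names[i], arg) == 0)
			return (int)i; /* match found: leave before the error path */
	}
	/* error path: reached only when no origin matched */
	return -1;
}
```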
Don't use the workaround to load libgcc_s on macOS. It is not needed
there, and it causes issues, as recent macOS versions dislike processes that fork
after threads were created (and the workaround creates a temporary
thread). This fixes crashes on macOS at least when using master-worker,
and using the system resolver.
This should fix GitHub issue #3035
This should be backported up to 2.8.
Not all platforms support thread-cpu bindings, so let's put
cpu_topo_dump_summary() under USE_CPU_AFFINITY guards.
Only needs to be backported if 1cc0e023ce ("MINOR: debug: add thread-cpu
bindings info in 'show dev' output") is backported.
This patch impacts the QUIC frontends. It reverts this patch
MINOR: quic-be: add a "CC connection" backend TX buffer pool
which adds <pool_head_quic_be_cc_buf> new pool to allocate CC (connection closed state)
TX buffers with bigger object size than the one for <pool_head_quic_cc_buf>.
Indeed the QUIC backends must be able to send at least 1200 bytes Initial packets.
From now on, both the QUIC frontends and backends use the same pool with
MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU)(1252 bytes) as object size.
Align titles style of debug_parse_cli_show_dev() with
cpu_dump_topology(). We will call the latter inside of
debug_parse_cli_show_dev() to show thread-cpu bindings info.
cpu_dump_topology() prints details about each enabled CPU and a summary with
clusters info and thread-cpu bindings. The latter is often useful for
debugging and we want to add it in the 'show dev' output.
So, let's split cpu_dump_topology() in two parts: cpu_topo_debug() to print the
details about each enabled CPU; and cpu_topo_dump_summary() to print only the
summary.
In the next commit we will modify cpu_topo_dump_summary() to write into local
trash buffer so that it can easily be called from debug_parse_cli_show_dev().
Exit with an error if multiple output filters (-ic, -srv, -st, -tc, -u*, etc.)
are used at the same time.
halog is designed to process and display output for only one filter at a time.
Using multiple filters simultaneously can cause a crash because the program is
not designed to manage multiple, separate result sets (e.g., one for
IP counts, another for URLs).
Supporting simultaneous filters would require a redesign to collect entries for
each filter in separate ebtrees. This would negatively impact performance and is
not requested for the moment. This patch prevents the crash by checking filter
combinations just after the command line parsing.
This issue was reported in GitHub #3031.
This should be backported in all stable versions.
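One common way to detect such a combination is sketched below, assuming
each output filter is represented by one bit in a mask. The function name
and the bit layout are hypothetical and do not reflect halog's actual
flags.

```c
/* Returns non-zero when more than one bit is set in <filters>, i.e. when
 * several output filters were requested at once. Clearing the lowest set
 * bit with (x & (x - 1)) leaves a non-zero value only if another bit
 * remains.
 */
static int multiple_output_filters(unsigned int filters)
{
	return (filters & (filters - 1)) != 0;
}
```

With such a helper, the check can be done once after command line parsing,
before any input line is processed.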
The "connection close state" TX buffer is used to build the datagram with
basically a CONNECTION_CLOSE frame to notify the peer about the connection
closure. It allows the quic_conn memory release and its replacement by a lighter
quic_cc_conn struct.
For the QUIC backend, there is a dedicated pool to build such datagrams from
bigger TX buffers. But from quic_conn_release(), this is the pool dedicated
to the QUIC frontends which was used to release the QUIC backend TX buffers.
This patch simply adds a test about the target of the connection to release
the "connection close state" TX buffers from the correct pool.
No backport needed.
It's super difficult to find the rules that operate idle conns depending
on their idle/safe/avail/private status. Some are in lists, others not.
Some are in trees, others not. Some have a flag set, others not. This
documents the rules before the definitions in connection-t.h. It could
even be backported to help during backport sessions.
Replace all calls to qc_is_listener() (resp. !qc_is_listener()) by calls to
objt_listener() (resp. objt_server()).
Remove the qc_is_listener() implementation and the QUIC_FL_CONN_LISTENER flag it
relied on.
When an appctx is initialized, there is a BUG_ON() to be sure the appctx is
really initialized on the right thread to avoid bugs on the thread
affinity. However, it is possible to not choose the thread when the appctx
is created and let it start on any thread. In that case, the thread
affinity is set when the appctx is initialized. So, we must take care to not
trigger the BUG_ON() in that case.
For now, we never hit the bug because the thread affinity is always set
during the appctx creation.
This patch must be backported as far as 2.8.
This bug only affects listeners, and only occurred on FreeBSD.
The FreeBSD issue has been reported here:
https://forums.freebsd.org/threads/quic-http-3-with-haproxy.98443/
where QUIC traces could reveal that sendmsg() calls lead to EINVAL
syscall errnos.
Such a similar issue could be reproduced from a FreeBSD 14-2 VM
with reg-tests/quic/retry.vtc as reg test.
As noted by Olivier, this issue could be fixed within the VM binding
the listener socket to INADDR_ANY.
That said, the symptoms are not exactly the same as the ones reported by the user.
What could be observed from such a VM is that if the first recvmsg() call
returns the datagram destination address, and if the listener
listening address is bound to a specific address, the calls to
sendmsg() fail because of the IP_SENDSRCADDR ip option value
set by cmsg_set_saddr(). According to the ip(4) FreeBSD manual
such an IP option must be used if the listening socket is
bound to a specific address. It is to be noted that into a VM
the first call to recvmsg() of the first connection does not return the datagram
destination address. This leads the first quic_conn to be initialized without
->local_addr value. It is this value which is used by the IP_SENDSRCADDR
ip option. In this case, the sendmsg() calls (without IP_SENDSRCADDR)
never fail. The issue appears at the second condition.
This patch replaces the conditions to use IP_SENDSRCADDR with a call to
qc_may_use_saddr(). This latter also checks that the listener listening
address is not INADDR_ANY to allow the use of the source address.
It is generalized to all the OSes. Indeed, there is no reason to set the source
address when the listener is bound to a specific address.
Must be backported as far as 2.8.
On the backend side, the H3 layer is responsible for decoding an HTTP/3 response
into an HTX message. Multiple responses may be received on a single
stream with interim status codes prior to the final one.
h3_resp_headers_to_htx() is the function used solely on backend side
responsible for H3 response to HTX transcoding. This patch extends it to
be able to properly support interim responses. When such a response is
received, the new flag H3_SF_RECV_INTERIM is set. This is converted to
QMUX qcs flag QC_SF_EOI_SUSPENDED.
The objective of this latter flag is to prevent stream EOI to be
reported during stream rcv_buf callback, even if HTX message contains
EOM and is empty. QC_SF_EOI_SUSPENDED will be cleared when the final
response is finally converted, which unblocks stream EOI notification for
next rcv_buf invocations. Note however that HTX EOM is untouched : it is
always set for both interim and final response reception.
As a minor adjustment, HTX_SL_F_BODYLESS is always set for interim
responses.
Contrary to frontend interim response handling, a flag is necessary on
QMUX layer. This is because H3 to HTX transcoding and rcv_buf callback
are two distinct operations, called under different context (MUX vs
stream tasklet).
Also note that H3 layer has two distinct flags for interim response
handling, one only used as a server (FE side) and the other as a client
(BE side). It was preferred to use two distinct flags which is
considered less error-prone, contrary to a single unified flag which
would require to always set the proxy side to ensure it is relevant or
not.
No need to backport.
On frontend side, HTTP/3 layer is responsible to transcode an HTX
response message into an HTTP/3 HEADERS frame. This operation is handled
via h3_resp_headers_send().
Prior to this patch, if HTX EOM was encountered in the HTX message after
response transcoding, <fin> was reported to the QMUX layer. This will in
turn cause FIN stream bit to be set when the response is emitted.
However, this is not correct as a single HTX response can be made up
of several interim messages, each delimited by an EOM block.
Most of the time, this bug will cause the client to close the connection
as it is invalid to receive an interim response with FIN bit set.
Fix this by now properly differentiating interim and final responses.
During interim response transcoding, the new flag H3_SF_SENT_INTERIM
will be set, which will prevent <fin> to be reported. Thus, <fin> will
only be notified for the final response.
This must be backported up to 2.6. Note that it relies on the previous
patch which also must be taken.
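The rule for reporting <fin> can be sketched as follows. This is a
simplified model, not h3_resp_headers_send(): the function name is
hypothetical, and the H3_SF_SENT_INTERIM flag handling is reduced to a
status code test where 1xx codes are treated as interim.

```c
/* status: HTTP status code of the response being transcoded
 * eom:    non-zero if an EOM was found after this response in the HTX message
 * Returns non-zero if <fin> may be reported to the QMUX layer.
 */
static int resp_may_report_fin(unsigned int status, int eom)
{
	int interim = (status >= 100 && status < 200);

	/* an interim response never ends the stream, even when followed by EOM */
	return eom && !interim;
}
```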
Previous patches have fixed interim response encoding via
h3_resp_headers_send(). However, it is still necessary to adjust h3
layer state-machine so that several successive HTTP responses are
accepted for a single stream.
Prior to this, QMUX was responsible to decree that the final HTX message
was encoded so that FIN stream can be emitted. However, with interim
response, MUX is in fact unable to properly determine this. As such,
this is the responsibility of the application protocol layer. To reflect
this, app_ops snd_buf callback is modified so that a new output argument
<fin> is added to it.
Note that for now this commit does not bring any functional change.
However, it will be necessary for the following patch. As such, it
should be backported prior to it to every version as necessary.
On frontend side, H3 layer transcodes HTX status code into HTTP/3
HEADERS frame. This is done by calling qpack_encode_int_status().
Prior to this patch, the latter function was also responsible for rejecting
an invalid value, which guaranteed that only valid codes were encoded
(between 100 and 999 values). However, this is not practical as it is
impossible to differentiate between an invalid code error and a buffer
room exhaustion.
Change this so that now HTTP/3 layer first ensures that HTX code is
valid. The stream is closed with H3_INTERNAL_ERROR if invalid value is
present. Thus, qpack_encode_int_status() will only report an error due
to buffer room exhaustion. If a small buffer is used, a standard buffer
will be reallocated which should be sufficient to encode the response.
The impact of this bug is minimal. Its main benefit is code clarity,
while also removing an unnecessary realloc when confronted with an
invalid HTTP code.
This should be backported at least up to 3.1. Prior to it, smallbuf
mechanism isn't present, hence the impact of this patch is less
important. However, it may still be backported to older versions, which
should facilitate picking patches for HTTP 1xx interim response support.
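The separation of the two error causes can be sketched as below. This is an
illustration only: the enum, the function name and the <room>/<needed>
parameters are hypothetical and do not mirror qpack_encode_int_status();
only the 100..999 range comes from the description above.

```c
#include <stddef.h>

enum status_enc_result {
	STATUS_ENC_OK,
	STATUS_ENC_INVALID, /* caller closes the stream with an internal error */
	STATUS_ENC_NO_ROOM, /* caller may retry with a larger buffer */
};

/* Validates <status> first, so that a failure of the encoding step itself
 * can only mean that the output buffer is too small.
 */
static enum status_enc_result encode_status(unsigned int status,
                                            size_t room, size_t needed)
{
	if (status < 100 || status > 999)
		return STATUS_ENC_INVALID;
	if (needed > room)
		return STATUS_ENC_NO_ROOM;
	return STATUS_ENC_OK;
}
```

With this ordering, an invalid code is rejected before any buffer is
reallocated.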
The previous commit fixes encoding of several successive HTTP response
messages when interim status codes are first reported. However,
h3_resp_headers_send() still was unable to interrupt encoding if output
buffer room was not sufficient. This case may be likely because small
buffers are used for headers encoding.
This commit fixes this situation. If output buffer is not empty prior to
response encoding, this means that a previous interim response message
was already encoded before. In this case, and if remaining space is not
sufficient, use buffer release mechanism : this allows to restart
response encoding by using a newer buffer. This process has already been
used for DATA and trailers encoding.
This must be backported up to 2.6. However, note that buffer release
mechanism is not present for version 2.8 and lower. In this case, qcs
flag QC_SF_BLK_MROOM should be enough as a replacement.
An HTTP response may contain several interim response messages (1xx
status) prior to a final response message (all other status codes). This may
cause issues with h3_resp_headers_send() called for response encoding
which assumes that it is only called once per stream, most notably
during output buffer handling.
This commit fixes output buffer handling when h3_resp_headers_send() is
called multiple times due to an interim response. Prior to it, interim
response was overwritten with newer response message. Most of the time,
this resulted in error for the client due to QPACK decoding failure.
This is now fixed so that each response is encoded one after the other.
Note that if encoding of several responses is bigger than output buffer,
an error is reported. This can definitely occur as small buffers are
used during header encoding. This situation will be improved by the next
patch.
This must be backported up to 2.6.
The goal is to help figure the OS version (kernel and userland), any
virtualization/containers, and the haproxy version and build features.
Sometimes even reporters themselves can be mistaken about the running
version or environment. Also printing this at the top helps draw a
visual delimitation between warnings and panic. Now we get something
like this:
PANIC! Thread 1 is about to kill the process.
HAProxy info:
version: 3.3-dev3-c863c0-18
features: +51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY (...)
Operating system info:
virtual machine: no
container: no
kernel: Linux 6.1.131 #1 SMP PREEMPT_DYNAMIC Fri Mar 14 01:04:55 CET 2025 x86_64
userland: Slackware 15.0 x86_64
* Thread 1 : id=0x7f615a8775c0 act=1 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
1/1 stuck=0 prof=0 harmless=0 isolated=0
cpu_ns: poll=1835010197 now=1835066102 diff=55905
(...)
For listen/frontend/backend, we now want to be able to clean up the
default-server directive that's no longer used past the end of the
section. For this we register a post-section function and perform the
cleanup there.
The default-server entry used to always be allocated. Now we'll postpone
its allocation for the first time we need it, i.e. during a "default-server"
directive, or when inheriting a defaults section which has one. The memory
savings are significant, on a large configuration with 100k backends and
no default-server directive, the memory usage dropped from 800MB RSS to
420MB (380 MB saved). It should be possible to also address configs using
default-server by releasing this entry when leaving the proxy section,
which is not done yet.
Now we only copy the default server's settings if such a default server
exists, otherwise we only initialize it. At the moment it always exists.
The change is mostly performed in srv_settings_cpy() since that's where
each caller passes through, and there's no point duplicating that test
everywhere.
The server struct has grown huge over time (~3.8kB), and having a copy
of it in the defsrv section of the struct proxy costs a lot of RAM,
that is not needed anymore at run time.
This patch replaces this struct with a dynamically allocated one. The
field is allocated and initialized during alloc_new_proxy() and is
freed when the proxy is destroyed for now. But the goal will be to
support freeing it after parsing the section.
The test in srv_ssl_settings_cpy() comparing src to the server's proxy's
default server does work but it's a subtle trap. Indeed, no check is made
on srv->proxy to be valid, and this only works because the compiler is
comparing pointer offsets. During the boot, it's common to have NULL here
in srv->proxy and of course in this case srv does not match that value
which is NULL plus epsilon. But when trying to turn defsrv to a dynamic
pointer instead, then the compiler is forced to dereference this NULL
srv->proxy and dies during init.
Let's always add the null check for srv->proxy before the test to avoid
this situation.
No backport is needed since the problem cannot happen yet.
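The guarded comparison can be sketched as follows, assuming defsrv has
already become a pointer. The structures are reduced to the two fields
involved and the function name is hypothetical; it is not the code of
srv_ssl_settings_cpy().

```c
#include <stddef.h>

struct server;

struct proxy {
	struct server *defsrv; /* dynamically allocated default server */
};

struct server {
	struct proxy *proxy;   /* may still be NULL during boot */
};

/* Returns non-zero if <src> is the default server of <srv>'s proxy.
 * srv->proxy is checked first so that it is never dereferenced when NULL.
 */
static int src_is_default_server(const struct server *srv,
                                 const struct server *src)
{
	return srv->proxy != NULL && src == srv->proxy->defsrv;
}
```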
At a few places we're seeing some open-coding of the same function, likely
because it looks overkill for what it's supposed to do, due to extraneous
tests that are not needed (e.g. check of the backend's PR_CAP_BE etc).
Let's just remove all these superfluous tests and inline it so that it
feels more suitable for use everywhere it's needed.
The server lookup in sticking_rule_find_target() uses an open-coded tree
search while we have a function for this, server_find_by_id(). In addition,
due to the way it's coded, the stick-table lock also covers the server
lookup by accident instead of being released earlier. This is not a real
problem though since such feature is rarely used nowadays.
Let's clean all this stuff by first retrieving the ID under the lock and
then looking up the corresponding server.
The code used to detect proxy id conflicts uses an open-coded lookup
in the ID tree which is not necessary since we already have functions
for this. Let's switch to that instead.
This function doesn't just look at the name but also the ID when the
argument starts with a '#'. So the name is not correct and explains
why this function is not always used when the name only is needed,
and why the list-based findserver() is used instead. So let's just
call the function "server_find()", and rename its generation-id based
cousin "server_find_unique()".
Let's just use the tree-based lookup instead of walking through the list.
This function is used to find duplicates in "track" statements and a few
such places, so it's important not to waste too much time on large setups.
findserver() used to check for duplicate server names. These are no
longer accepted in 3.3 so let's get rid of that test and simplify the
code. Note that the function still only uses the list instead of the
tree.
Released version 3.3-dev3 with the following main changes :
- BUG/MINOR: quic-be: Wrong retry_source_connection_id check
- MEDIUM: sink: change the sink mode type to PR_MODE_SYSLOG
- MEDIUM: server: move _srv_check_proxy_mode() checks from server init to finalize
- MINOR: server: move send-proxy* incompatibility check in _srv_check_proxy_mode()
- MINOR: mailers: warn if mailers are configured but not actually used
- BUG/MEDIUM: counters/server: fix server and proxy last_change mixup
- MEDIUM: server: add and use a separate last_change variable for internal use
- MEDIUM: proxy: add and use a separate last_change variable for internal use
- MINOR: counters: rename last_change counter to last_state_change
- MINOR: ssl: check TLS1.3 ciphersuites again in clienthello with recent AWS-LC
- BUG/MEDIUM: hlua: Forbid any L6/L7 sample fetche functions from lua services
- BUG/MEDIUM: mux-h2: Properly handle connection error during preface sending
- BUG/MINOR: jwt: Copy input and parameters in dedicated buffers in jwt_verify converter
- DOC: Fix 'jwt_verify' converter doc
- MINOR: jwt: Rename pkey to pubkey in jwt_cert_tree_entry struct
- MINOR: jwt: Remove unused parameter in convert_ecdsa_sig
- MAJOR: jwt: Allow certificate instead of public key in jwt_verify converter
- MINOR: ssl: Allow 'commit ssl cert' with no privkey
- MINOR: ssl: Prevent delete on certificate used by jwt_verify
- REGTESTS: jwt: Add test with actual certificate passed to jwt_verify
- REGTESTS: jwt: Test update of certificate used in jwt_verify
- DOC: 'jwt_verify' converter now supports certificates
- REGTESTS: restrict execution to a single thread group
- MINOR: ssl: Introduce new smp_client_hello_parse() function
- MEDIUM: stats: add persistent state to typed output format
- BUG/MINOR: httpclient: wrongly named httpproxy flag
- MINOR: ssl/ocsp: stop using the flags from the httpclient CLI
- MEDIUM: httpclient: split the CLI from the actual httpclient API
- MEDIUM: httpclient: implement a way to use directly htx data
- MINOR: httpclient/cli: add --htx option
- BUILD: dev/phash: remove the accidentally committed a.out file
- BUG/MINOR: ssl: crash in ssl_sock_io_cb() with SSL traces and idle connections
- BUILD/MEDIUM: deviceatlas: fix when installed in custom locations.
- DOC: deviceatlas build clarifications
- BUG/MINOR: ssl/ocsp: fix definition discrepancies with ocsp_update_init()
- MINOR: proto-tcp: Add support for TCP MD5 signature for listeners and servers
- BUILD: cfgparse-tcp: Add _GNU_SOURCE for TCP_MD5SIG_MAXKEYLEN
- BUG/MINOR: proto-tcp: Take care to initialized tcp_md5sig structure
- BUG/MINOR: http-act: Fix parsing of the expression argument for pause action
- MEDIUM: httpclient: add a Content-Length when the payload is known
- CLEANUP: ssl: Rename ssl_trace-t.h to ssl_trace.h
- MINOR: pattern: add a counter of added/freed patterns
- CI: set DEBUG_STRICT=2 for coverity scan
- CI: enable USE_QUIC=1 for OpenSSL versions >= 3.5.0
- CI: github: add an OpenSSL 3.5.0 job
- CI: github: update the stable CI to ubuntu-24.04
- BUG/MEDIUM: quic: SSL/TCP handshake failures with OpenSSL 3.5
- CI: github: update to OpenSSL 3.5.1
- BUG/MINOR: quic: Missing TLS 1.3 QUIC cipher suites and groups inits (OpenSSL 3.5 QUIC API)
- BUG/MINOR: quic-be: Malformed coalesced Initial packets
- MINOR: quic: Prevent QUIC backend use with the OpenSSL QUIC compatibility module (USE_OPENSS_COMPAT)
- MINOR: reg-tests: first QUIC+H3 reg tests (QUIC address validation)
- MINOR: quic-be: Set the backend alpn if not set by conf
- MINOR: quic-be: TLS version restriction to 1.3
- MINOR: cfgparse: enforce QUIC MUX compat on server line
- MINOR: server: support QUIC for dynamic servers
- CI: github: skip a ssl library version when latest is already in the list
- MEDIUM: resolvers: switch dns-accept-family to "auto" by default
- BUG/MINOR: resolvers: don't lower the case of binary DNS format
- MINOR: resolvers: do not duplicate the hostname_dn field
- MINOR: proto-tcp: Register a feature to report TCP MD5 signature support
- BUG/MINOR: listener: really assign distinct IDs to shards
- MINOR: quic: Prevent QUIC build with OpenSSL 3.5 new QUIC API version < 3.5.1
- BUG/MEDIUM: quic: Crash after QUIC server callbacks restoration (OpenSSL 3.5)
- REGTESTS: use two haproxy instances to distinguish the QUIC traces
- BUG/MEDIUM: http-client: Don't wake http-client applet if nothing was xferred
- BUG/MEDIUM: http-client: Properly inc input data when HTX blocks are xferred
- BUG/MEDIUM: http-client: Ask for more room when request data cannot be xferred
- BUG/MEDIUM: http-client: Test HTX_FL_EOM flag before commiting the HTX buffer
- BUG/MINOR: http-client: Ignore 1XX interim responses in non-HTX mode
- BUG/MINOR: http-client: Reject any 101-switching-protocols response
- BUG/MEDIUM: http-client: Drain the request if an early response is received
- BUG/MEDIUM: http-client: Notify applet has more data to deliver until the EOM
- BUG/MINOR: h3: fix https scheme request encoding for BE side
- MINOR: h1-htx: Add function to format an HTX message in its H1 representation
- BUG/MINOR: mux-h1: Use configured error files if possible for early H1 errors
- BUG/MINOR: h1-htx: Don't forget to init flags in h1_format_htx_msg function
- CLEANUP: assorted typo fixes in the code, commits and doc
- BUILD: adjust scripts/build-ssl.sh to modern CMake system of QuicTLS
- MINOR: debug: add distro name and version in postmortem
Since 2012, systemd-compliant distributions have provided an
/etc/os-release file. This file has a standardized format, see details at
https://www.freedesktop.org/software/systemd/man/latest/os-release.html.
Let's read it in feed_post_mortem_linux() to gather more info about the
distribution.
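As an illustration only, the KEY=value format can be parsed as sketched
below. os_release_get() is a hypothetical helper, not the code added to
feed_post_mortem_linux(); it works on in-memory text so that the example
stays self-contained, and it does not handle escape sequences.

```c
#include <stddef.h>
#include <string.h>

/* Looks up <key> in os-release formatted <text> (lines of KEY=value, the
 * value being optionally enclosed in double quotes) and copies the value
 * into <out>, truncated to <size> bytes including the trailing zero.
 * Returns 1 if the key was found, 0 otherwise.
 */
int os_release_get(const char *text, const char *key, char *out, size_t size)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        const char *eol = strchr(line, '\n');
        size_t llen = eol ? (size_t)(eol - line) : strlen(line);

        if (llen > klen && strncmp(line, key, klen) == 0 && line[klen] == '=') {
            const char *val = line + klen + 1;
            size_t vlen = llen - klen - 1;

            /* strip one pair of surrounding double quotes, if any */
            if (vlen >= 2 && val[0] == '"' && val[vlen - 1] == '"') {
                val++;
                vlen -= 2;
            }
            if (size == 0)
                return 0;
            if (vlen >= size)
                vlen = size - 1;
            memcpy(out, val, vlen);
            out[vlen] = '\0';
            return 1;
        }
        line = eol ? eol + 1 : NULL;
    }
    return 0;
}
```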
(cherry picked from commit f1594c41368baf8f60737b229e4359fa7e1289a9)
Signed-off-by: Willy Tarreau <w@1wt.eu>
The regression was introduced by commit 187ae28 ("MINOR: h1-htx: Add
function to format an HTX message in its H1 representation"). We must be
sure the flags variable is initialized in the h1_format_htx_msg() function.
This patch must be backported with the commit above.
The H1 multiplexer is able to produce some errors on its own to report early
errors, before the stream is created. In that case, the error files of the
proxy were tested to detect empty files (or /dev/null) but they were not
used to produce the error itself.
But the documentation states that configured error files are used in all
cases. And in fact, it is not really a problem to use these files. We must
just format a full HTX message. Thanks to the previous patch, it is now
possible.
This patch should fix the issue #3032. It should be backported to 3.2. For
older versions, it must be discussed but it should be quite easy to do.
The function h1_format_htx_msg() can now be used to convert a valid HTX
message into its H1 representation. No validity test is performed, the HTX
message must be valid. Only trailers are silently ignored if the message is
not chunked. In addition, the destination buffer must be empty. 1XX interim
responses should be supported. But again, there are no validity tests.
An HTTP/3 request must contain a :scheme pseudo-header. Currently, only
the "https" value is expected due to the QUIC transport layer in use.
However, https value is incorrectly encoded due to a QPACK index value
mismatch in qpack_encode_scheme(). Fix it to ensure that scheme is now
properly set for HTTP/3 requests on the backend side.
No need to backport this.
When we leave the I/O handler with an unfinished request, we must report the
applet has more data to deliver. Otherwise, when the channel request buffer
is emptied, the http-client applet is not always woken up to forward the
remaining request data.
This issue was probably revealed by commit "BUG/MEDIUM: http-client: Don't
wake http-client applet if nothing was xferred". It is only an issue with
large POSTs, when the payload is streamed.
This patch must be backported as far as 2.6 with the commit above. But on
older versions, the applet API may differ. So be careful.
When a large request is sent, it is possible to have a response before the
end of the request. It is valid from HTTP perspective but it is an issue
with the current design of the http-client. Indeed, the request and the
response are handled sequentially. So the response will be blocked, waiting
for the end of the request. Most of the time, it is not an issue, except when
the request transfer is blocked. In that case, the applet is blocked.
With the current API, it is not possible to handle early response and
continue the request transfer. So, this case cannot be handled. In that case,
it seems reasonable to drain the request if a response is received. This
way, the request transfer, from the caller point of view, is never blocked
and the response can be properly processed.
To do so, the action flag HTTPCLIENT_FA_DRAIN_REQ is added to the
http-client. When it is set, the request payload is just dropped. In that
case, we take care to not report the end of input to properly report the
request was truncated, especially in logs.
It is only an issue with large POSTs, when the payload is streamed.
This patch must be backported as far as 2.6.
Protocol upgrades are not supported by the http-client. So report an error if
a 101-switching-protocols response is received. Of course, it is unexpected
because the API is not designed to support upgrades. But it is better to
properly handle this case.
This patch could be backported as far as 2.6. It depends on the commit
"BUG/MINOR: http-client: Ignore 1XX interim responses in non-HTX mode".
When the response is re-formatted in raw message, the 1XX interim responses
must be skipped. Otherwise, information of the first interim response will
be saved (status line and headers) and those from the final response will be
dropped.
Note that for now, in HTX-mode, the interim messages are removed.
This patch must be backported as far as 2.6.
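The intended selection can be sketched as follows. toy_final_status() is
a hypothetical helper working on a plain list of status codes, not the
http-client code itself; it also folds in the rejection of 101 responses
described in the neighbouring message.

```c
/* Scans the status codes of the responses received for one request, in
 * order. Interim 1xx responses are skipped so that only the final one is
 * kept. Returns the final status, 0 if none was seen yet, or -1 if a
 * 101 response shows up (upgrades are not supported).
 */
int toy_final_status(const int *status, int count)
{
    int i;

    for (i = 0; i < count; i++) {
        if (status[i] == 101)
            return -1;
        if (status[i] >= 100 && status[i] < 200)
            continue;
        return status[i];
    }
    return 0;
}
```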
When the htx_to_buf() function is called, if the HTX message is empty, the
buffer is reset. So HTX flags must not be tested after because the info may
be lost.
So now, we take care to test HTX_FL_EOM flag before calling htx_to_buf().
This patch must be backported as far as 2.8.
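The ordering issue can be shown with a toy model. The toy_htx type,
TOY_FL_EOM and toy_to_buf() below are hypothetical stand-ins that only
mimic the reset-on-empty behaviour described above; they are not the real
HTX API.

```c
#include <stddef.h>

#define TOY_FL_EOM 0x1u     /* stands in for HTX_FL_EOM */

struct toy_htx {
    unsigned int flags;
    size_t data;            /* bytes of payload still stored */
};

/* Mimics the behaviour described above: committing an empty message
 * resets the underlying storage, so its flags are lost.
 */
void toy_to_buf(struct toy_htx *htx)
{
    if (htx->data == 0)
        htx->flags = 0;
}

/* Returns 1 if the end of message was reached. The flag is sampled
 * before the commit, because it may no longer be there afterwards.
 */
int toy_commit_and_check_eom(struct toy_htx *htx)
{
    int eom = !!(htx->flags & TOY_FL_EOM);

    toy_to_buf(htx);
    return eom;
}
```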
When the request payload cannot be xferred to the channel because its buffer
is full, we must request more room by calling sc_need_room(). It is
important to be sure the httpclient applet will not be woken up in a loop to
push more data while it is not possible.
It is only an issue with large POSTs, when the payload is streamed.
This patch must be backported as far as 2.6. Note that on 2.6,
sc_need_room() only takes one argument.
When HTX blocks from the requests are transferred into the channel buffer,
the return value of htx_xfer_blks() function must not be used to increment
the channel input value because meta data are counted here while they are
not part of input data. Because of this bug, it is possible to forward more
data than those present in the channel buffer.
Instead, we look at the input data before and after the transfer and the
difference is added.
It is only an issue with large POSTs, when the payload is streamed.
This patch must be backported as far as 2.6.
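A minimal sketch of the accounting, using a toy message type. toy_xfer()
is a hypothetical stand-in for the transfer function, and the 8 bytes of
metadata per transfer are an arbitrary assumption chosen for the example.

```c
#include <stddef.h>

struct toy_msg {
    size_t payload;   /* payload bytes stored */
    size_t meta;      /* bookkeeping bytes, not part of the input data */
};

/* Moves up to <count> payload bytes from <src> to <dst>, plus 8 bytes of
 * metadata for the block. Like the transfer function described above,
 * the return value counts both payload and metadata.
 */
size_t toy_xfer(struct toy_msg *dst, struct toy_msg *src, size_t count)
{
    if (count > src->payload)
        count = src->payload;
    if (!count)
        return 0;
    src->payload -= count;
    dst->payload += count;
    dst->meta += 8;
    return count + 8;
}

/* Returns the number of input bytes to account for: the payload
 * difference before and after the transfer, not toy_xfer()'s result.
 */
size_t toy_forward(struct toy_msg *dst, struct toy_msg *src, size_t count)
{
    size_t before = dst->payload;

    toy_xfer(dst, src, count);
    return dst->payload - before;
}
```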
When data are transferred to or from the http-client, the applet is
systematically woken up, even when no data are transferred. This could lead
to needless wakeups. When called from a lua script, if data are blocked
for a while, this leads to a wakeup ping-pong loop where the http-client
applet is woken up by the lua script which wakes back the script.
To fix the issue, in httpclient_req_xfer() and httpclient_res_xfer()
functions, we now take care to not wake the http-client applet up when no
data are transferred.
This patch must be backported as far as 2.6.
The aim of this patch is to identify the QUIC traces between the QUIC frontend
and backend parts. Two haproxy instances are created. The c(1|2) http clients
connect to ha1 with TCP frontends and QUIC backends. ha2 embeds two QUIC listeners
with s1 as TCP backend. When the traces are activated, they are dumped to stderr.
Fortunately, they are prefixed by the haproxy instance name (h1 or h2). This is very
useful to identify the QUIC instances.
Revert this patch which is no longer useful since OpenSSL 3.5.1 to remove the
QUIC server callback restoration after SSL context switch:
MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset
It was required for 3.5.0. That said, there was no CI for OpenSSL 3.5 at the date
of this commit. The CI recently revealed that the QUIC server side could crash
during QUIC reg tests just after having restored the callbacks as implemented by
the commit above.
Also revert this commit which is no longer useful because it arrived with the commit
above:
BUG/MEDIUM: quic: SSL/TCP handshake failures with OpenSSL 3.
Must be backported to 3.2.
The QUIC listener part was impacted by the 3.5.0 OpenSSL new QUIC API with several
issues which have been fixed by 3.5.1.
Add a #error to prevent such OpenSSL 3.5 new QUIC API use with version below 3.5.1.
Must be backported to 3.2.
A fix was made in 3.0 for the case where sharded listeners were using
a same ID with commit 0db8b6034d ("BUG/MINOR: listener: always assign
distinct IDs to shards"). However, the fix is incorrect. By checking the
ID of temporary node instead of the kept one in bind_complete_thread_setup()
it ends up never inserting the used nodes at this point, thus not reserving
them. The side effect is that assigning too close IDs to subsequent
listeners results in the same ID still being assigned twice since not
reserved. Example:
global
nbthread 20
frontend foo
bind :8000 shards by-thread id 10
bind :8010 shards by-thread id 20
The first one will start a series from 10 to 29 and the second one a
series from 20 to 39. But 20 not being inserted when creating the shards,
it will remain available for the post-parsing phase that assigns all
unassigned IDs by filling holes, and two listeners will have ID 20.
By checking the correct node, the problem disappears. The patch above
was marked for backporting to 2.6, so this fix should be backported that
far as well.
"HAVE_TCP_MD5SIG" feature is now registered if TCP MD5 signature is
supported. This will help the feature detection in the reg-test script
dedicated to this feature.
The hostdn.key field in the server contains a pure copy of the hostname_dn
since commit 3406766d57 ("MEDIUM: resolvers: add a ref between servers and
srv request or used SRV record") which wanted to lowercase it. Since it's
not necessary, let's drop this useless copy. In addition, the return from
strdup() was not tested, so it could theoretically crash the process under
heavy memory contention.
The server's "hostname_dn" is in Domain Name format, not a pure string, as
converted by resolv_str_to_dn_label(). It is made of lower-case string
components delimited by binary lengths, e.g. <0x03>www<0x07>haproxy<0x03>org.
As such it must not be lowercased again in srv_state_srv_update(), because
1) it's useless on the name components since already done, and 2) because
it would replace component lengths 97 and above by 32-char shorter ones.
Granted, not many domain names have that large components so the risk is
very low but the operation is always wrong anyway. This was brought in
2.5 by commit 3406766d57 ("MEDIUM: resolvers: add a ref between servers
and srv request or used SRV record").
In the same vein, let's fix the confusing strcasecmp() that are applied
to this binary format, and use memcmp() instead. Here there's basically
no risk to incorrectly match the wrong record, but that test alone is
confusing enough to provoke the existence of the bug above.
Finally let's update the comment for that field to mention that it's
in this format and already lower cased.
Better not backport this, the risk of facing this bug is almost zero, and
every time we touch such files something breaks for bad reasons.
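For reference, the length-prefixed layout and the binary comparison can
be sketched as below. toy_str_to_dn() and toy_dn_equal() are hypothetical
helpers, not resolv_str_to_dn_label(); the input is assumed to be already
lower-cased and no trailing zero is appended.

```c
#include <stddef.h>
#include <string.h>

/* Converts the dotted name <str> into length-prefixed labels in <dn>.
 * Returns the number of bytes written, or -1 on error (empty label,
 * label longer than 63 bytes, or output buffer too small).
 */
int toy_str_to_dn(const char *str, unsigned char *dn, size_t size)
{
    size_t out = 0;

    while (*str) {
        const char *dot = strchr(str, '.');
        size_t len = dot ? (size_t)(dot - str) : strlen(str);

        if (len == 0 || len > 63 || out + 1 + len > size)
            return -1;
        dn[out++] = (unsigned char)len;   /* binary length, not a character */
        memcpy(dn + out, str, len);
        out += len;
        str += len;
        if (*str == '.')
            str++;
    }
    return (int)out;
}

/* Binary comparison: lengths are compared as bytes, never as text. */
int toy_dn_equal(const unsigned char *a, size_t alen,
                 const unsigned char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

Since the length bytes are plain binary values, any character-oriented
function applied to the whole buffer treats them as text, which is the
confusion this patch removes.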
Skip the job for "latest" libssl version, when this version is the same
as one already in the list.
This avoids having 2 jobs for OpenSSL 3.5.1 since no new dev versions are
available for now and 3.5.1 is already in the list.
To properly support QUIC for dynamic servers, it is required to extend
the "add server" CLI handler :
* ensure conformity between server address and proto
* automatically set proto to QUIC if not specified
* prepare_srv callback must be called to initialize required SSL context
Prior to this patch, crashes may occur when trying to use QUIC with
dynamic servers.
Also, destroy_srv callback must be called when a dynamic server is
deallocated. This ensures that there is no memory leak due to SSL
context.
No need to backport.
Add postparsing checks to control server line conformity regarding QUIC
both on the server address and the MUX protocol. An error is reported in
the following cases :
* proto quic is explicitly specified but server address does not
specify quic4/quic6 prefix
* another proto is explicitly specified but server address uses
quic4/quic6 prefix
This patch skips the TLS version settings. They have as a side effect to add
all the TLS version extensions to the ClientHello message (TLS 1.0 to TLS 1.3).
QUIC supports only TLS 1.3.
First simple VTC file for QUIC reg tests. Two listeners are configured, one with
Retry enabled and the other without. Two clients simply try to connect to these
listeners to make a basic H3 request.
Make the server line parsing fail when a QUIC backend is configured if haproxy
is built to use the OpenSSL stack compatibility module. This latter does not
support the QUIC client part.
This bug fix completes this patch which was not sufficient:
MINOR: quic-be: Allow sending 1200 bytes Initial datagrams
This patch could not allow the build of well formed Initial packets coalesced to
other (Handshake) packets. Indeed, the <padding> parameter passed to qc_build_pkt()
is deduced from a first value: <padding> value and must be set to 1 for
the last encryption level. As a client, the last encryption level is always
the Handshake encryption level. But <padding> was always set to 1 for a QUIC
client, leading the first Initial packet to be malformed because considered
as the second one into the same datagram.
So, this patch sets <padding> value passed to qc_build_pkt() to 1 only when there
is no last encryption level at all, to allow the build of Initial only packets
(not coalesced) or when it has frames to send (coalesced packets).
No need to backport.
This bug impacts both QUIC backends and frontends with OpenSSL 3.5 as QUIC API.
The connections to a haproxy QUIC listener from a haproxy QUIC backend could not
work at all without HelloRetryRequest TLS messages emitted by the backend
asking the QUIC client to restart the handshake followed by TLS alerts:
conn. @(nil) OpenSSL error[0xa000098] read_state_machine: excessive message size
Furthermore, the Initial CRYPTO data sent by the client (the ClientHello TLS
message) were big (about two 1252-byte packets). Analyzing the packets showed
that a key_share extension with <unknown> as value was long (more than 1 kB).
This extension is in relation with the groups but does not belong to the
groups supported by QUIC.
That said such connections could work with ngtcp2 as backend built against the same
OSSL TLS stack API but with a HelloRetryRequest.
ngtcp2 always set the QUIC default cipher suites and group, for all the stacks it
supports as implemented by this patch.
So this patch configures both QUIC backend and frontend cipher suites and groups
calling SSL_CTX_set_ciphersuites() and SSL_CTX_set1_groups_list() with the correct
argument, except for SSL_CTX_set1_groups_list() which fails with QUIC TLS for
an unknown reason at this time.
The call to SSL_CTX_set_options() is useless from ssl_quic_initial_ctx() for the QUIC
clients. One relies on ssl_sock_prepare_srv_ssl_ctx() to set them from now on.
This patch is effective for all the supported stacks without impact for AWS-LC
and QUIC TLS, and fixes the connections for haproxy QUIC frontends and backends
when built against the OpenSSL 3.5 QUIC API.
A new define HAVE_OPENSSL_QUICTLS has been added to openssl-compat.h to distinguish
the QUIC TLS stack.
Must be backported to 3.2.
This bug arrived with this commit:
MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset
To make QUIC connection succeed with OpenSSL 3.5 API, a call to quic_ssl_set_tls_cbs()
was needed from several callbacks which call SSL_set_SSL_CTX(). This has as side effect
to set the QUIC callbacks used by the OpenSSL 3.5 API.
But quic_ssl_set_tls_cbs() was also called for TCP sessions leading the SSL stack
to run QUIC code, if the QUIC support is enabled.
To fix this, simply ignore the TCP connections inspecting the <ssl_qc_app_data_index>
index value which is NULL for such connections.
Must be backported to 3.2.
OpenSSL 3.5.0 introduced experimental support for QUIC. This change enables the use_quic option when a compatible version of OpenSSL is detected, allowing QUIC-based functionality to be leveraged where applicable. Feature remains disabled for earlier versions to ensure compatibility.
Patterns are allocated when loading maps/acls from a file or dynamically
via the CLI, and are released only from the CLI (e.g. "clear map xxx").
These ones do not use pools and are much harder to monitor, e.g. in case
a script adds many and forgets to clear them, etc.
Let's add a new pair of metrics "PatternsAdded" and "PatternsFreed" that
will report the number of added and freed patterns respectively. This
makes it simple to graph both. The difference between the two normally
represents the number of allocated patterns. If Added grows without
Freed following, it can indicate a faulty script that doesn't perform
the needed cleanup. The metrics are also made available to Prometheus
as patterns_added_total and patterns_freed_total respectively.
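The relation between the two counters can be sketched as follows. The
toy_pat_stats type and its helpers are hypothetical and are not the
counters added by this patch; they only show how the number of patterns
in use derives from the two totals.

```c
/* Two monotonically increasing totals, as exposed by the new metrics. */
struct toy_pat_stats {
    unsigned long long added;
    unsigned long long freed;
};

void toy_pat_added(struct toy_pat_stats *st, unsigned long long n)
{
    st->added += n;
}

void toy_pat_freed(struct toy_pat_stats *st, unsigned long long n)
{
    st->freed += n;
}

/* Number of patterns currently allocated: added minus freed. A value
 * that keeps growing suggests that something never cleans up.
 */
unsigned long long toy_pat_in_use(const struct toy_pat_stats *st)
{
    return st->added - st->freed;
}
```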
This introduces a change of behavior in the httpclient API. When
generating a request with a payload buffer, the size of the buffer
payload is known and does not need to be streamed in chunks.
This patch forces the payload buffer to be sent using a Content-Length
header in the request; however, the behavior does not change if a
callback is still used instead of a buffer.
When the "pause" action is parsed, if an expression is used instead of a
static value, the position of the current argument after the expression
evaluation is incremented while it should not be. The sample_parse_expr()
function already takes care of it. However, it should still be incremented
when a time value was parsed.
This patch must be backported to 3.2.
When the TCP MD5 signature is enabled, on a listening socket or an outgoing
one, the tcp_md5sig structure must be initialized first.
It is a 3.3-specific issue. No backport needed.
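A minimal, Linux-only sketch of such an initialization is shown below.
md5sig_prepare() and md5sig_apply() are hypothetical helpers and not the
code of this patch; the sketch assumes the struct tcp_md5sig layout and
the TCP_MD5SIG option exposed by <netinet/tcp.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes TCP_MD5SIG_MAXKEYLEN */
#endif
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Fills <md5> for peer <addr> with the key <key>. The whole structure
 * is zeroed first so that padding and unused fields never carry stale
 * data. Returns 0 on success, -1 if the key or the address is too large.
 */
int md5sig_prepare(struct tcp_md5sig *md5, const struct sockaddr *addr,
                   socklen_t addrlen, const char *key)
{
    size_t klen = strlen(key);

    if (klen > TCP_MD5SIG_MAXKEYLEN || addrlen > sizeof(md5->tcpm_addr))
        return -1;
    memset(md5, 0, sizeof(*md5));
    memcpy(&md5->tcpm_addr, addr, addrlen);
    memcpy(md5->tcpm_key, key, klen);
    md5->tcpm_keylen = (unsigned short)klen;
    return 0;
}

/* Applies the prepared signature settings to socket <fd>. */
int md5sig_apply(int fd, const struct tcp_md5sig *md5)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_MD5SIG, md5, sizeof(*md5));
}
```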
This patch adds the support for the RFC2385 (Protection of BGP Sessions via
the TCP MD5 Signature Option) for the listeners and the servers. The
feature is only available on Linux. Keywords are not exposed otherwise.
By setting "tcp-md5sig <password>" option on a bind line, TCP segments of
all connections instantiated from the listening socket will be signed with a
16-byte MD5 digest. The same option can be set on a server line to protect
outgoing connections to the corresponding server.
The primary use case for this option is to allow BGP to protect itself
against the introduction of spoofed TCP segments into the connection
stream. But it can be useful for any very long-lived TCP connections.
A reg-test was added and it will be executed only on linux. All other
targets are excluded.
Since patch 20718f40b6 ("MEDIUM: ssl/ckch: add filename and linenum
argument to crt-store parsing"), the definition of ocsp_update_init()
and its declaration do not share the same arguments.
Must be backported to 3.2.
We are reusing DEVICEATLAS_INC/DEVICEATLAS_LIB when the DeviceAtlas
library had been compiled and installed with cmake and make install targets.
Works fine except when ldconfig is unaware of the path, thus adding
cflags/ldflags into the mix.
Ideally, to be backported down to the lowest stable branch.
TRACE_ENTER is crashing in ssl_sock_io_cb() in case an idle connection is
being stolen. Indeed the function could be called with a NULL context
and dereferencing it will crash.
This patch fixes the issue by initializing ctx only once it is usable,
and moving TRACE_ENTER after the initialization.
This must be backported to 3.2.
Commit 41f28b3c53 ("DEV: phash: Update 414 and 431 status codes to phash")
accidentally committed a.out, resulting in build/checkout issues when
locally rebuilt. Let's drop it.
This should be backported to 3.1.
Use the new HTTPCLIENT_O_RES_HTX flag when using the CLI httpclient with
--htx.
It allows the response to be processed directly in HTX, then the htx_dump()
function is used to display a debug output.
Example:
echo "httpclient --htx GET https://haproxy.org" | socat /tmp/haproxy.sock -
htx=0x79fd72a2e200(size=16336,data=139,used=6,wrap=NO,flags=0x00000010,extra=0,first=0,head=0,tail=5,tail_addr=139,head_addr=0,end_addr=0)
[0] type=HTX_BLK_RES_SL - size=31 - addr=0 HTTP/2.0 301
[1] type=HTX_BLK_HDR - size=15 - addr=31 content-length: 0
[2] type=HTX_BLK_HDR - size=32 - addr=46 location: https://www.haproxy.org/
[3] type=HTX_BLK_HDR - size=25 - addr=78 alt-svc: h3=":443"; ma=3600
[4] type=HTX_BLK_HDR - size=35 - addr=103 set-cookie: served=2:TLSv1.3+TCP:IPv4
[5] type=HTX_BLK_EOH - size=1 - addr=138 <empty>
Add a HTTPCLIENT_O_RES_HTX flag which allows storing the HTX data
directly in the response buffer instead of extracting the data in raw
format.
This is useful when the data need to be reused in another request.
This patch splits the httpclient code to prevent confusion between the
httpclient CLI command and the actual httpclient API.
Indeed there was a confusion between the flag used internally by the
CLI command, and the actual httpclient API.
hc_cli_* functions as well as HC_C_F_* defines were moved to
httpclient_cli.c.
The ocsp-update uses the flags from the httpclient CLI, which are not
supposed to be used elsewhere since this is a state for the CLI.
This patch implements HC_OCSP flags for the ocsp-update.
The HC_F_HTTPPROXY flag was wrongly named and does not use the correct
value, indeed this flag was meant to be used for the httpclient API, not
the httpclient CLI.
This patch fixes the problem by introducing HTTPCLIENT_FO_HTTPPROXY
which must be set in hc->flags.
Also add a member 'options' in the httpclient structure, because the
member flags is reinitialized when starting.
Must be backported as far as 3.0.
Add a fourth character to the second column of the "typed output format"
to indicate whether the value results from a volatile or persistent metric
('V' or 'P' characters respectively). A persistent metric means the value
could possibly be preserved across reloads by leveraging a shared memory
between multiple co-processes. Such metrics are identified as "shared" in
the code (since they are possibly shared between multiple co-processes).
Some reg-tests were updated to take that change into account, also, some
outputs in the configuration manual were updated to reflect current
behavior.
In this patch we introduce a new helper function called `smp_client_hello_parse()` to extract
information presented in a TLS client hello handshake message. 7 sample fetches have also been
modified to use this helper function to do the common client hello parsing and use the result
to do further processing of extensions/cipher.
Fixes: #2532
When threads are enabled and running on a machine with multiple CCX
or multiple nodes, thread groups are now enabled since 3.3-dev2, causing
load-balancing algorithms to randomly fail due to incoming connections
spreading over multiple groups and using different load balancing indexes.
Let's just force "thread-groups 1" into all configs when threads are
enabled to avoid this.
Using certificates in the jwt_verify converter makes it possible to use
the CLI certificate updates, which is still impossible with public keys
(the legacy behavior).
The jwt_verify can now take public certificates as second parameter,
either with actual certificate path (not previously mentioned) or from a
predefined crt-store or from a variable.
A ckch_store used in JWT verification might not have any ckch instances
or crt-list entries linked but we don't want to be able to remove it via
the CLI anyway since it would make all future jwt_verify calls using
this certificate fail.
The ckch_stores might be used to store public certificates only so in
this case we won't provide private keys when updating the certificate
via the CLI.
If the ckch_store is actually used in a bind or server line an error
will still be raised if the private key is missing.
The 'jwt_verify' converter could only be passed public keys as second
parameter instead of full-on public certificates. This patch allows
proper certificates to be used.
Those certificates can be loaded in ckch_stores like any other
certificate which means that all the certificate-related operations that
can be made via the CLI can now benefit JWT validation as well.
We now have two ways JWT validation can work, the legacy one which only
relies on public keys which could not be stored in ckch_stores without
some in depth changes in the way the ckch_stores are built. In this
legacy way, the public keys are fully stored in a cache dedicated to JWT
only which does not have any CLI commands and any way to update them
during runtime. It also requires that all the public keys used are
passed at least once explicitly to the 'jwt_verify' converter so that
they can be loaded during init.
The new way uses actual certificates, either already stored in the
ckch_store tree (if predefined in a crt-store or already used previously
in the configuration) or loaded in the ckch_store tree during init if
they are explicitly used in the configuration like so:
var(txn.bearer),jwt_verify(txn.jwt_alg,"cert.pem")
When using a variable (or any other way that can only be resolved during
runtime) in place of the converter's <key> parameter, the first time we
encounter a new value (for which we don't have any entry in the jwt
tree) we will lock the ckch_store tree and try to perform a lookup in
it. If the lookup fails, an entry will still be inserted into the jwt
tree so that any following call with this value avoids performing the
ckch_store tree lookup.
Contrary to what the doc says, the jwt_verify converter only works with
a public key and not a full certificate for certificate based protocols
(everything but HMAC).
This patch should be backported up to 2.8.
When resolving variable values the temporary trash chunks are used so
when calling the 'jwt_verify' converter with two variable parameters
like in the following line, the input would be overwritten by the value
of the second parameter :
var(txn.bearer),jwt_verify(txn.jwt_alg,txn.cert)
Copying the values into dedicated alloc'ed buffers prevents any new call
to get_trash_chunk from erasing the data we need in the converter.
This patch can be backported up to 2.8.
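A minimal sketch of the overwrite problem and of the fix, assuming a single
shared scratch buffer (a simplification of the trash chunks). The helpers
resolve_var(), dup_value() and first_value_survives() are hypothetical
names used only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a shared temporary buffer: every call reuses
 * the same storage, so an earlier result is lost on the next call. */
static char *resolve_var(const char *value)
{
    static char scratch[64];

    snprintf(scratch, sizeof(scratch), "%s", value);
    return scratch;
}

/* Returns a private heap copy of <s>, or NULL on allocation failure. */
static char *dup_value(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Resolves two values and reports whether the first one survived.
 * With <copy_first> set, the first value is duplicated before the second
 * resolution and therefore stays intact. */
int first_value_survives(int copy_first)
{
    char *alg = resolve_var("RS256");
    char *saved = copy_first ? dup_value(alg) : alg;
    int ok;

    if (!saved)
        return 0;
    resolve_var("cert.pem"); /* overwrites the shared storage */
    ok = strcmp(saved, "RS256") == 0;
    if (copy_first)
        free(saved);
    return ok;
}
```

first_value_survives(0) returns 0 because the first value was overwritten,
while first_value_survives(1) returns 1 thanks to the dedicated copy.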
On backend side, an error at connection level during the preface sending was
not properly handled and could lead to a spinning loop on process_stream()
when the h2 stream on client side was blocked, for instance because of h2
flow control.
It appeared that no transition was performed from the PREFACE state to an
ERROR state on the H2 connection when an error occurred on the underlying
connection. In that case, the H2 connection was woken up in a loop to try to
receive data, waking up the upper stream at the same time.
To fix the issue, an H2C error must be reported. Most state transitions are
handled by the demux function. So it is the right place to do so. First, in
PREFACE state and on server side, if an error occurred on the TCP
connection, an error is now reported on the H2 connection. REFUSED_STREAM
error code is used in that case. In addition, in that case, we also take
care to properly handle the connection shutdown.
This patch should fix the issue #3020. It must be backported to all stable
versions.
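The missing transition can be pictured with the following sketch. The
state names, the structure and demux_step() are hypothetical and much
simpler than the real mux code; only the REFUSED_STREAM error code value
(0x7) comes from the HTTP/2 specification.

```c
/* Hypothetical connection states, reduced to what the sketch needs. */
enum conn_state { ST_PREFACE, ST_SETTINGS, ST_ERROR };

#define ERR_REFUSED_STREAM 0x7 /* HTTP/2 REFUSED_STREAM error code */

struct h2_conn_sketch {
    enum conn_state state;
    int is_backend;    /* non-zero on the server-facing side */
    int transport_err; /* non-zero when the underlying connection failed */
    int errcode;       /* error reported on the H2 connection */
};

/* One demux step. Returns 1 if the caller may keep waiting for data,
 * 0 once the connection is in error and must be shut down. */
int demux_step(struct h2_conn_sketch *c)
{
    if (c->state == ST_PREFACE && c->is_backend && c->transport_err) {
        /* without this transition the caller would be woken up forever */
        c->errcode = ERR_REFUSED_STREAM;
        c->state = ST_ERROR;
    }
    return c->state != ST_ERROR;
}
```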
It was already forbidden to use HTTP sample fetch functions from lua
services. An error is triggered if it happens. However, the error must be
extended to any L6/L7 sample fetch functions.
Indeed, a lua service is an applet. It is totally unexpected for an applet to
access input data in a channel's buffer. These data have not been
analyzed yet and are still subject to change. An applet, lua or not,
must never access "not forwarded" data. Only output data are
available. For now, if a lua applet relies on any L6/L7 sample fetch
function, the behavior is undefined and inconsistent.
So to fix the issue, the hlua flag HLUA_F_MAY_USE_HTTP is renamed to
HLUA_F_MAY_USE_CHANNELS_DATA. This flag is used to prevent any lua applet
from using L6/L7 sample fetch functions.
This patch could be backported to all stable versions.
Patch ed9b8fec49 ("BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in
RSA+ECDSA configuration") partly fixed a cipher selection problem with
AWS-LC. However, it no longer checked whether the ciphersuites were
available in haproxy, which is still a problem.
The problem was fixed in AWS-LC 1.46.0 with this PR
https://github.com/aws/aws-lc/pull/2092.
This patch makes it possible to filter the TLS13 ciphersuites again with
recent versions of AWS-LC. However, since there are no macros to check the
AWS-LC version, it is enabled at the next AWS-LC API version change
following the fix in AWS-LC v1.50.0.
This could be backported where ed9b8fec49 was backported.
Since proxy and server struct already have an internal last_change
variable and we cannot merge it with the shared counter one, let's
rename the last_change counter to be more specific and prevent the
mixup between the two.
last_change counter is renamed to last_state_change, and unlike the
internal last_change, this one is a shared counter so it is expected
to be updated by other processes behind our back.
However, when updating last_state_change counter, we use the value
of the server/proxy last_change as reference value.
Same motivation as previous commit, proxy last_change is "abused" because
it is used for 2 different purposes, one for stats, and the other one
for process-local internal use.
Let's add a separate proxy-only last_change variable for internal use,
and leave the last_change shared (and thread-grouped) counter for
statistics.
last_change server metric is used for 2 separate purposes. First it is
used to report last server state change date for stats and other related
metrics. But it is also used internally, including in sensitive paths,
such as lb related code, to make decisions or perform computations
(e.g. in srv_dynamic_maxconn()).
Due to last_change counter now being split over thread groups since 16eb0fa
("MAJOR: counters: dispatch counters over thread groups"), reading the
aggregated value has a cost, and we cannot afford to consult last_change
value from srv_dynamic_maxconn() anymore. Moreover, since the value is
used to make decisions for the current process, we don't want the variable
to be updated by another process behind our back.
To prevent performance regression and sharing issues, let's instead add a
separate srv->last_change value, which is not updated atomically (given how
rare the updates are), and only serves for places where the use of the
aggregated last_change counter/stats (split over thread groups) is too
costly.
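The split between a per-thread-group statistic and a cheap process-local
value can be sketched as follows. All names here are hypothetical, the
number of groups is arbitrary, and the aggregate is assumed to be the most
recent value across groups. Atomic operations are omitted.

```c
#define NB_TGROUPS 4 /* hypothetical number of thread groups */

struct server_sketch {
    long shared_last_state_change[NB_TGROUPS]; /* stats, one slot per group */
    long last_change;                          /* process-local, cheap to read */
};

/* Aggregated view for statistics: the most recent value across groups
 * (assumption made for this sketch). Its cost grows with the group count. */
long stats_last_state_change(const struct server_sketch *s)
{
    long last = 0;
    int i;

    for (i = 0; i < NB_TGROUPS; i++) {
        if (s->shared_last_state_change[i] > last)
            last = s->shared_last_state_change[i];
    }
    return last;
}

/* Records a state change: the local field is updated first and serves as
 * the reference value for the shared counter. */
void record_state_change(struct server_sketch *s, int tgid, long now)
{
    s->last_change = now;
    s->shared_last_state_change[tgid] = s->last_change;
}

/* Hot-path helper: reads only the local field, never the aggregate. */
long seconds_since_change(const struct server_sketch *s, long now)
{
    return now - s->last_change;
}
```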
16eb0fa ("MAJOR: counters: dispatch counters over thread groups")
introduced some bugs: as a result of improper copy paste during
COUNTERS_SHARED_LAST() macro introduction, some functions such as
srv_downtime() which used to make use of the server last_change variable
now use the proxy one, which doesn't make sense and will likely cause
unexpected logical errors/bugs.
Let's fix them all at once by properly pointing to the server last_change
variable when relevant.
No backport needed.
Now that native mailers configuration is only usable with Lua mailers,
Willy noticed that we lack a way to warn the user if mailers were
previously configured on an older version but Lua mailers were not loaded,
which could trick the user into thinking mailers keep working when
transitioning to 3.2 while they do not.
In this patch we add the 'core.use_native_mailers_config()' Lua function
which should be called in Lua script body before making use of
'Proxy:get_mailers()' function to retrieve legacy mailers configuration
from haproxy main config. This way haproxy effectively knows that the
native mailers config is actually being used from Lua (which indicates
user correctly migrated from native mailers to Lua mailers), else if
mailers are configured but not used from Lua then haproxy warns the user
about the fact that they will be ignored unless they are used from Lua.
(e.g.: using the provided 'examples/lua/mailers.lua' to ease transition)
_srv_check_proxy_mode() is currently executed during server init (from
_srv_parse_init()). While this used to be fine for the current checks, it
occurs a bit too early to be usable for checks that depend on server
keywords having been evaluated, for instance.
As such, to make _srv_check_proxy_mode() more relevant and be extended
with additional checks in the future, let's call it later during server
finalization, once all server keywords were evaluated.
No change of behavior is expected.
This commit broke the QUIC backend connection to servers without address validation
or retry activated:
MINOR: quic-be: address validation support implementation (RETRY)
Indeed, the retry_source_connection_id transport parameter was checked
as if it was required, as if the peer (server) was always using address validation.
Furthermore, relying on ->odcid.len to ensure a retry token was received is not
correct.
This patch ensures the retry_source_connection_id transport parameter is checked
only when a retry token was received (->retry_token != NULL). In this case
it also checks that this transport parameter is present when a retry token
has been received (tx_params->retry_source_connection_id.len != 0).
No need to backport.
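The corrected check can be sketched like this. The structures and the
function retry_scid_param_ok() are hypothetical and only mirror the logic
described above: the parameter is examined only when a retry token was
received, and in that case it must be present and match the SCID seen in
the RETRY packet.

```c
#include <stddef.h>
#include <string.h>

struct cid_sketch {
    unsigned char data[20];
    size_t len;
};

struct client_conn_sketch {
    const unsigned char *retry_token; /* NULL when no RETRY packet was received */
    struct cid_sketch retry_scid;     /* SCID seen in the RETRY packet */
};

struct peer_params_sketch {
    struct cid_sketch retry_source_connection_id; /* len == 0 when absent */
};

/* Returns 1 when the transport parameter is acceptable, 0 otherwise. */
int retry_scid_param_ok(const struct client_conn_sketch *c,
                        const struct peer_params_sketch *p)
{
    const struct cid_sketch *rs = &p->retry_source_connection_id;

    if (!c->retry_token)
        return 1; /* no retry token: the parameter is not checked */
    if (rs->len == 0)
        return 0; /* retry token received but parameter missing */
    return rs->len == c->retry_scid.len &&
           memcmp(rs->data, c->retry_scid.data, rs->len) == 0;
}
```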
Released version 3.3-dev2 with the following main changes :
- BUG/MINOR: config/server: reject QUIC addresses
- MINOR: server: implement helper to identify QUIC servers
- MINOR: server: mark QUIC support as experimental
- MINOR: mux-quic-be: allow QUIC proto on backend side
- MINOR: quic-be: Correct Version Information transp. param encoding
- MINOR: quic-be: Version Information transport parameter check
- MINOR: quic-be: Call ->prepare_srv() callback at parsing time
- MINOR: quic-be: QUIC backend XPRT and transport parameters init during parsing
- MINOR: quic-be: QUIC server xprt already set when preparing their CTXs
- MINOR: quic-be: Add a function for the TLS context allocations
- MINOR: quic-be: Correct the QUIC protocol lookup
- MINOR: quic-be: ssl_sock contexts allocation and misc adaptations
- MINOR: quic-be: SSL sessions initializations
- MINOR: quic-be: Add a function to initialize the QUIC client transport parameters
- MINOR: sock: Add protocol and socket types parameters to sock_create_server_socket()
- MINOR: quic-be: ->connect() protocol callback adaptations
- MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn())
- MINOR: quic-be: xprt ->init() adapatations
- MINOR: quic-be: add field for max_udp_payload_size into quic_conn
- MINOR: quic-be: Do not redispatch the datagrams
- MINOR: quic-be: Datagrams and packet parsing support
- MINOR: quic-be: Handshake packet number space discarding
- MINOR: h3-be: Correctly retrieve h3 counters
- MINOR: quic-be: Store asap the DCID
- MINOR: quic-be: Build post handshake frames
- MINOR: quic-be: Add the conn object to the server SSL context
- MINOR: quic-be: Initial packet number space discarding.
- MINOR: quic-be: I/O handler switch adaptation
- MINOR: quic-be: Store the remote transport parameters asap
- MINOR: quic-be: Missing callbacks initializations (USE_QUIC_OPENSSL_COMPAT)
- MINOR: quic-be: Make the secret derivation works for QUIC backends (USE_QUIC_OPENSSL_COMPAT)
- MINOR: quic-be: SSL_get_peer_quic_transport_params() not defined by OpenSSL 3.5 QUIC API
- MINOR: quic-be: get rid of ->li quic_conn member
- MINOR: quic-be: Prevent the MUX to send/receive data
- MINOR: quic: define proper proto on QUIC servers
- MEDIUM: quic-be: initialize MUX on handshake completion
- BUG/MINOR: hlua: Don't forget the return statement after a hlua_yieldk()
- BUILD: hlua: Fix warnings about uninitialized variables
- BUILD: listener: fix 'for' loop inline variable declaration
- BUILD: hlua: Fix warnings about uninitialized variables (2)
- BUG/MEDIUM: mux-quic: adjust wakeup behavior
- MEDIUM: backend: delay MUX init with ALPN even if proto is forced
- MINOR: quic: mark ctrl layer as ready on quic_connect_server()
- MINOR: mux-quic: improve documentation for snd/rcv app-ops
- MINOR: mux-quic: define flag for backend side
- MINOR: mux-quic: set expect data only on frontend side
- MINOR: mux-quic: instantiate first stream on backend side
- MINOR: quic: wakeup backend MUX on handshake completed
- MINOR: hq-interop: decode response into HTX for backend side support
- MINOR: hq-interop: encode request from HTX for backend side support
- CLEANUP: quic-be: Add comments about qc_new_conn() usage
- BUG/MINOR: quic-be: CID double free upon qc_new_conn() failures
- MINOR: quic-be: Avoid SSL context unreachable code without USE_QUIC_OPENSSL_COMPAT
- BUG/MINOR: quic: prevent crash on startup with -dt
- MINOR: server: reject QUIC servers without explicit SSL
- BUG/MINOR: quic: work around NEW_TOKEN parsing error on backend side
- BUG/MINOR: http-ana: Properly handle keep-query redirect option if no QS
- BUG/MINOR: quic: don't restrict reception on backend privileged ports
- MINOR: hq-interop: handle HTX response forward if not enough space
- BUG/MINOR: quic: Fix OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn callback (OpenSSL3.5)
- BUG/MINOR: quic: fix ODCID initialization on frontend side
- BUG/MEDIUM: cli: Don't consume data if outbuf is full or not available
- MINOR: cli: handle EOS/ERROR first
- BUG/MEDIUM: check: Set SOCKERR by default when a connection error is reported
- BUG/MINOR: mux-quic: check sc_attach_mux return value
- MINOR: h3: support basic HTX start-line conversion into HTTP/3 request
- MINOR: h3: encode request headers
- MINOR: h3: complete HTTP/3 request method encoding
- MINOR: h3: complete HTTP/3 request scheme encoding
- MINOR: h3: adjust path request encoding
- MINOR: h3: adjust auth request encoding or fallback to host
- MINOR: h3: prepare support for response parsing
- MINOR: h3: convert HTTP/3 response into HTX for backend side support
- MINOR: h3: complete response status transcoding
- MINOR: h3: transcode H3 response headers into HTX blocks
- MINOR: h3: use BUG_ON() on missing request start-line
- MINOR: h3: reject invalid :status in response
- DOC: config: prefer-last-server: add notes for non-deterministic algorithms
- CLEANUP: connection: remove unused mux-ops dedicated to QUIC
- BUG/MINOR: mux-quic/h3: properly handle too low peer fctl initial stream
- MINOR: mux-quic: support max bidi streams value set by the peer
- MINOR: mux-quic: abort conn if cannot create stream due to fctl
- MEDIUM: mux-quic: implement attach for new streams on backend side
- BUG/MAJOR: fwlc: Count an avoided server as unusable.
- MINOR: fwlc: Factorize code.
- BUG/MEDIUM: quic: do not release BE quic-conn prior to upper conn
- MAJOR: cfgparse: turn the same proxy name warning to an error
- MAJOR: cfgparse: make sure server names are unique within a backend
- BUG/MINOR: tools: only reset argument start upon new argument
- BUG/MINOR: stream: Avoid recursive evaluation for unique-id based on itself
- BUG/MINOR: log: Be able to use %ID alias at anytime of the stream's evaluation
- MINOR: hlua: emit a log instead of an alert for aborted actions due to unavailable yield
- MAJOR: mailers: remove native mailers support
- BUG/MEDIUM: ssl/clienthello: ECDSA with ssl-max-ver TLSv1.2 and no ECDSA ciphers
- DOC: configuration: add details on prefer-client-ciphers
- MINOR: ssl: Add "renegotiate" server option
- DOC: remove the program section from the documentation
- MAJOR: mworker: remove program section support
- BUG/MINOR: quic: wrong QUIC_FT_CONNECTION_CLOSE(0x1c) frame encoding
- MINOR: quic-be: add a "CC connection" backend TX buffer pool
- MINOR: quic: Useless TX buffer size reduction in closing state
- MINOR: quic-be: Allow sending 1200 bytes Initial datagrams
- MINOR: quic-be: address validation support implementation (RETRY)
- MEDIUM: proxy: deprecate the "transparent" and "option transparent" directives
- REGTESTS: update http_reuse_be_transparent with "transparent" deprecated
- REGTESTS: script: also add a line pointing to the log file
- DOC: config: explain how to deal with "transparent" deprecation
- MEDIUM: proxy: mark the "dispatch" directive as deprecated
- DOC: config: crt-list clarify default cert + cert-bundle
- MEDIUM: cpu-topo: switch to the "performance" cpu-policy by default
- SCRIPTS: drop the HTML generation from announce-release
- BUG/MINOR: tools: use my_unsetenv instead of unsetenv
- CLEANUP: startup: move comment about nbthread where it's more appropriate
- BUILD: qpack: fix a build issue on older compilers
Got this on gcc-4.8:
src/qpack-enc.c: In function 'qpack_encode_method':
src/qpack-enc.c:168:3: error: 'for' loop initial declarations are only allowed in C99 mode
for (size_t i = 0; i < istlen(other); ++i)
^
This came from commit a0912cf914 ("MINOR: h3: complete HTTP/3 request
method encoding"), no backport is needed.
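For reference, the portable form simply declares the loop index before the
'for' statement. The function below is a made-up example, not the code from
src/qpack-enc.c.

```c
#include <stddef.h>
#include <string.h>

/* Counts lowercase ASCII letters in <s>. The index is declared at the top
 * of the block rather than inside the 'for' statement, so the function also
 * builds with compilers that do not default to C99 mode. */
size_t count_lower(const char *s)
{
    size_t i;
    size_t n = 0;
    size_t len = strlen(s);

    for (i = 0; i < len; ++i) {
        if (s[i] >= 'a' && s[i] <= 'z')
            n++;
    }
    return n;
}
```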
Let's use our own implementation of unsetenv() instead of the one provided
by libc. The libc implementation may vary depending on the UNIX
distribution. The implementation from libc.so.1 ported to Illumos (see the
link below) caused an infinite loop in clean_env(), where we invoke unsetenv().
(https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/gen/getenv.c#L411C1-L456C1)
This was reported in GitHub #3018 and the reporter proposed a patch, which
we really appreciate! But looking at his fix and at the implementations of
unsetenv() in FreeBSD libc and in Linux glibc 2.31, it seems that the algorithm
of clean_env() will perform better with our my_unsetenv() implementation.
This should be backported in versions 3.1 and 3.2.
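As an illustration of what a self-contained removal routine may look like,
here is a sketch that works on a caller-provided NULL-terminated array
instead of the process environment. env_remove() is a hypothetical helper
and is not the my_unsetenv() implementation.

```c
#include <stddef.h>
#include <string.h>

/* Removes every "name=value" entry matching <name> from the NULL-terminated
 * array <env>, compacting it in place. Returns the number of removed
 * entries. A single pass is made, so the loop always terminates. */
int env_remove(char **env, const char *name)
{
    size_t len = strlen(name);
    char **rd, **wr;
    int removed = 0;

    for (rd = wr = env; *rd; rd++) {
        if (strncmp(*rd, name, len) == 0 && (*rd)[len] == '=') {
            removed++; /* skip this entry */
            continue;
        }
        *wr++ = *rd;   /* keep this entry */
    }
    *wr = NULL;
    return removed;
}
```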
It has not been used over the last 5 years or so and systematically
requires manual removal. Let's just stop producing it. Also take
this opportunity to add the missing link to /discussions.
As mentioned during the NUMA series development, the goal is to use
all available cores in the most efficient way by default, which
normally corresponds to "cpu-policy performance". The previous default
choice of "cpu-policy first-usable-node" was only meant to stay 100%
identical to before cpu-policy.
So let's switch the default cpu-policy to "performance" right now.
The doc was updated to reflect this.
Clarify that HAProxy duplicates crt-list entries for multi-cert bundles
which can create unexpected side-effects as only the very first
certificate after duplication is considered as default implicitly.
As mentioned in [1], the "dispatch" directive from haproxy 1.0 has long
outlived its original purpose and still suffers from a number of technical
limitations (no checks, no SSL, no idle connections, etc.) and still hinders some
internal evolutions. It's now time to mark it as deprecated, and to remove
it in 3.5 [2]. It was already recommended against in the documentation but
remained popular in raw TCP environments for being shorter to write.
The directive will now cause a warning to be emitted, suggesting an
alternate method involving "server". The warning can be shut using
"expose-deprecated-directives". The rare configs from 1.0 where
"dispatch" is combined with sticky servers using cookies will just
need to set these servers' weights to zero to prevent them from
being selected by the load balancing algorithm. All of this is
explained in the doc with examples.
Two reg tests were using this method, one purposely for this directive,
which now has expose-deprecated-directives, and another one to test the
behavior of idle connections, which was updated to use "server" and
extended to test both "http-reuse never" and "http-reuse always".
[1] https://github.com/orgs/haproxy/discussions/2921
[2] https://github.com/haproxy/wiki/wiki/Breaking-changes
The explanations for the "option transparent" keyword were a bit scarce
regarding deprecation, so let's explain how to replace it with a server
line that does the same.
I never counted the number of hours I've been spending selecting then
copy-pasting the directory output and manually appending "/LOG" to read
a log file but it amounts to tens if not hundreds. Let's just add a direct
pointer to the log file at the end of the log for a failed run.
With commit e93f3ea3f8 ("MEDIUM: proxy: deprecate the "transparent" and
"option transparent" directives") this one no longer works as the config
either has to be adjusted to use server 0.0.0.0 or to enable the deprecated
feature. The test used to validate a technical limitation ("transparent"
not supporting shared connections), indicated as being comparable to
"http-reuse never". Let's now duplicate the test for "http-reuse never"
and "http-reuse always" and validate both behaviors.
Take this opportunity to fix a few problems in this config:
- use "nbthread 1": depending on the thread where the connection
arrives, the connection may or may not be reused
- add explicit URLs to the clients so that they can be recognized
in the logs
- add comments to make it clearer what to expect for each test
As discussed here [1], "transparent" (already deprecated) and
"option transparent" are horrible hacks which should really disappear
in favor of "server xxx 0.0.0.0" which doesn't rely on hackish code
path. This old feature is now deprecated in 3.3 and will disappear in
3.5, as indicated here [2]. A warning is emitted when used, explaining
how to proceed, and how to silence the warning using the global
"expose-deprecated-directives" if needed. The doc was updated to
reflect this new state.
[1] https://github.com/orgs/haproxy/discussions/2921
[2] https://github.com/haproxy/wiki/wiki/Breaking-changes
- Add ->retry_token and ->retry_token_len new quic_conn struct members to store
the retry tokens. These objects are allocated by quic_rx_pkt_parse() and
released by quic_conn_release().
- Add <pool_head_quic_retry_token> new pool for these tokens.
- Implement quic_retry_packet_check() to check the integrity tag of these tokens
upon RETRY packets receipt. quic_tls_generate_retry_integrity_tag() is called
by this new function. It has been modified to pass the address where the tag
must be generated.
- Add <resend> new parameter to quic_pktns_discard(). This function is called
to discard the packet number spaces to which the already sent TX packets and
frames are attached. <resend> allows the caller to prevent this function from
releasing the in-flight TX packets/frames. The frames are requeued to be resent.
- Modify quic_rx_pkt_parse() to handle the RETRY packets. What must be done upon
such packets receipt is:
- store the retry token,
- store the new peer SCID as the DCID of the connection. Note that the peer will
modify again its SCID. This is why this SCID is also stored as the ODCID
which must be matched with the peer retry_source_connection_id transport parameter,
- discard the Initial packet number space without flagging it as discarded and
prevent retransmissions calling qc_set_timer(),
- modify the TLS cryptographic cipher contexts (RX/TX),
- wakeup the I/O handler to send new Initial packets asap.
- Modify quic_transport_param_decode() to handle the retry_source_connection_id
transport parameter as a QUIC client. Then its caller is modified to
check this transport parameter matches with the SCID sent by the peer with
the RETRY packet.
This easy to understand patch is not intrusive at all and cannot break the QUIC
listeners.
The QUIC client MUST always pad its datagrams with Initial packets. A "!l" (not
a listener) test OR'ed with the existing ones is added to satisfy the condition
to allow the build of such datagrams.
There is no need to limit the size of the TX buffer to QUIC_MIN_CC_PKTSIZE bytes
when the connection is in closing state. Once this useless test is removed, there
is still a test which limits the number of bytes to be used from this TX buffer.
It limits this number of bytes to the size of the TX buffer itself:
if (end > (unsigned char *)b_wrap(buf))
end = (unsigned char *)b_wrap(buf);
This is exactly what is needed when the connection is in closing state. Indeed,
the size of the TX buffers is limited to reduce the memory usage. The connection
only needs to send short datagrams with at most 2 packets carrying CONNECTION_CLOSE*
frames. They are built only once and backed up into a small TX buffer allocated
from a dedicated pool.
The size of this TX buffer is QUIC_MAX_CC_BUFSIZE which depends on QUIC_MIN_CC_PKTSIZE:
#define QUIC_MIN_CC_PKTSIZE 128
#define QUIC_MAX_CC_BUFSIZE (2 * (QUIC_MIN_CC_PKTSIZE + QUIC_DGRAM_HEADLEN))
This size is smaller than an MTU.
This patch should be backported as far as 2.9 to ease further backports to come.
A QUIC client must be able to close a connection sending Initial packets. But
QUIC client Initial packets must always be at least 1200 bytes long. To reduce
the memory use of TX buffers of a connection in "closing" state, a pool
was dedicated to this purpose, but its TX buffer size (QUIC_MAX_CC_BUFSIZE)
is too small for such packets.
This patch adds a "closing state connection" TX buffer pool with the same role
for QUIC backends.
This is an old bug which was there since this commit:
MINOR: quic: Avoid zeroing frame structures
It seems QUIC_FT_CONNECTION_CLOSE was confused with QUIC_FT_CONNECTION_CLOSE_APP
which does not include a "frame type" field. This field was not initialized
(and thus held a random value), which prevented the packet from being built
because the packet builder assumes that packets with such frames are very short.
Must be backported as far as 2.6.
This patch completely removes support for the program section: neither the
section parser nor the mworker internals support it anymore.
The program section was considered dysfunctional and not fully
compatible with the "mworker V3" model. Users who want to run an
external program must use their init system.
The documentation is cleaned up in another patch.
This "renegotiate" option can be set on SSL backends to allow secure
renegotiation. It is mostly useful with SSL libraries that disable
secure renegotiation by default (such as AWS-LC).
The "no-renegotiate" one can be used the other way around, to disable
secure renegotiation that could be allowed by default.
Those two options can be set via "ssl-default-server-options" as well.
prefer-client-ciphers does not work exactly the same way when used with
a dual algorithm stack (ECDSA + RSA). This patch details its behavior.
This patch must be backported in every maintained version.
Problem was discovered in #2988.
Patch 23093c72 ("BUG/MINOR: ssl: suboptimal certificate selection with TLSv1.3
and dual ECDSA/RSA") introduced a problem when prioritizing the ECDSA
with TLSv1.3.
Indeed, when a client with TLSv1.3 capabilities announces a list of
ECDSA sigalgs, a list of TLSv1.3 ciphersuites compatible with ECDSA,
but only RSA ciphers for TLSv1.2, and haproxy is configured with
ssl-max-ver TLSv1.2, then haproxy would use the ECDSA keypair, but the
client wouldn't be able to process it because TLSv1.2 was negotiated.
HAProxy would be configured like this:
ssl-default-bind-options ssl-max-ver TLSv1.2
And a client could be used this way:
openssl s_client -connect localhost:8443 -cipher ECDHE-ECDSA-AES128-GCM-SHA256 \
-ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256
This patch fixes the issue by checking if TLSv1.3 was configured before
allowing ECDSA if a TLSv1.3 ciphersuite is in the list.
This could be backported where 23093c72 ("BUG/MINOR: ssl: suboptimal
certificate selection with TLSv1.3 and dual ECDSA/RSA") was backported.
However this is quite sensitive and we should wait a bit before the
backport.
This should fix issue #2988.
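The decision described in this fix can be summarized with a small
predicate. This is a simplified sketch: the structure, its fields and
may_use_ecdsa() are hypothetical, and the real ClientHello processing
handles many more cases.

```c
/* Hypothetical summary of what the client announced. */
struct hello_sketch {
    int ecdsa_sigalgs;      /* ECDSA signature algorithms announced */
    int tls12_ecdsa_cipher; /* an ECDSA cipher usable up to TLSv1.2 is offered */
    int tls13_ciphersuite;  /* at least one TLSv1.3 ciphersuite is offered */
};

/* Returns 1 if the ECDSA certificate may be selected. <tls13_enabled> is
 * non-zero when the local configuration allows TLSv1.3 to be negotiated. */
int may_use_ecdsa(const struct hello_sketch *h, int tls13_enabled)
{
    if (!h->ecdsa_sigalgs)
        return 0;
    if (h->tls12_ecdsa_cipher)
        return 1;
    /* a TLSv1.3 ciphersuite only justifies ECDSA if TLSv1.3 can be used */
    return h->tls13_ciphersuite && tls13_enabled;
}
```

With a client that only offers RSA ciphers for TLSv1.2, the predicate
returns 0 as soon as TLSv1.3 is disabled locally, so the RSA keypair is
used instead.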
As mentioned in the 2.8 announcement on the mailing list [1] and on the wiki [2],
native mailers were deprecated and planned for removal in 3.3. Now is
the time to drop the legacy code for native mailers which is based on a
tcpcheck "hack" and cannot be maintained. Lua mailers should be used as
a drop-in replacement. Indeed, "mailers" and associated config directives
are preserved because mailers config is exposed to Lua, which helps smooth
the transition from native mailers to Lua based ones.
As a reminder, to keep mailers configuration working as before without
making changes to the config file, simply add the line below to the global
section:
lua-load examples/lua/mailers.lua
mailers.lua script (provided in the git repository, adjust path as needed)
may be customized by users familiar with Lua, by default it emulates the
behavior of the native (now removed) mailers.
[1]: https://www.mail-archive.com/haproxy@formilux.org/msg43600.html
[2]: https://github.com/haproxy/wiki/wiki/Breaking-changes
As reported by Chris Staite in GH #3002, trying to yield from a Lua
action during a client disconnect causes the script to be interrupted
(which is expected) and an alert to be emitted with the error:
"Lua function '%s': yield not allowed".
While this error is well suited for cases where the yield is not expected
at all (i.e. when the context doesn't allow it) and results from a yield misuse
in the Lua script, it isn't the case when the yield is exceptionally not
available due to an abort or error in the request/response processing.
Because of that we raise an alert but the user cannot do anything about it
(the script is correct), so it is confusing and pollutes the logs.
In this patch we introduce the ACT_OPT_FINAL_EARLY flag which is a
complementary flag to ACT_OPT_FINAL. This flag is set when
ACT_OPT_FINAL is set earlier than normal (due to error/abort).
hlua_action() then checks for this flag to decide whether an error (alert)
or a simple log message should be emitted when the yield is not available.
It should solve GH #3002. Thanks to Chris Staite (@chrisstaite-menlo) for
having reported the issue and suggested a solution.
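The resulting decision can be sketched as follows. The flag values, the
enum and report_for_yield() are hypothetical: the sketch only shows how an
"early" complement of a "last call" flag lets the caller pick between an
alert and a plain log message.

```c
/* Hypothetical flag values for this sketch. */
#define OPT_FINAL       0x01u /* last call: yielding is not possible */
#define OPT_FINAL_EARLY 0x02u /* the last call came early (abort/error) */

enum yield_report { REPORT_NONE, REPORT_LOG, REPORT_ALERT };

/* Decides what to emit when a script asks to yield.
 * <wants_yield> is non-zero when the script tried to yield. */
enum yield_report report_for_yield(unsigned int flags, int wants_yield)
{
    if (!wants_yield || !(flags & OPT_FINAL))
        return REPORT_NONE; /* nothing to report: yield is allowed */
    if (flags & OPT_FINAL_EARLY)
        return REPORT_LOG;  /* not the script's fault: just log it */
    return REPORT_ALERT;    /* yield misuse in the script */
}
```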
In a log-format string, using "%[unique-id]" or "%ID" should be equivalent.
However, for the first one, the unique ID is generated when the sample fetch
function is called. This is not true for the alias. In that case, the
stream's unique ID is generated when the log message is emitted. Otherwise,
by default, the unique id is automatically generated at the end of the HTTP
request analysis.
So, if the alias "%ID" is used in a log-format string anywhere before the end
of the request analysis, the evaluation fails and the ID is considered as
empty. This is inconsistent and contradicts the "%ID"
documentation.
To fix the issue, instead of evaluating the unique ID when the log message
is emitted, it is now performed on demand when "%ID" format is evaluated.
This patch should fix the issue #3016. It should be backported to all stable
versions. It relies on the following commit:
* BUG/MINOR: stream: Avoid recursive evaluation for unique-id based on itself
There is nothing that prevents a "unique-id-format" from referencing itself,
using '%ID' or '%[unique-id]'. If the sample fetch function is used, it
leads to an infinite loop, recursively calling the function responsible for
generating the unique ID.
One solution is to detect it during the configuration parsing to trigger an
error. With this patch, we just inhibit recursive calls by considering the
unique-id as empty during its evaluation. So the "id-%[unique-id]" log-format
string will be evaluated as "id-".
This patch must be backported to all stable versions.
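Both behaviors, on-demand generation and protection against recursion, can
be illustrated with the sketch below. It is a toy model: the structure, the
functions and the fact that only the "%ID" alias is expanded are
assumptions made for the example, not the actual log-format engine.

```c
#include <stddef.h>
#include <string.h>

struct stream_sketch {
    const char *id_format; /* format used to build the unique ID */
    char unique_id[64];
    int has_id;            /* set once the ID has been generated */
    int generating;        /* set while the ID format is being evaluated */
};

const char *stream_unique_id(struct stream_sketch *s);

/* Copies <fmt> into <out>, replacing each "%ID" with the stream's unique
 * ID. The output is truncated to fit and always NUL-terminated. */
void expand_format(struct stream_sketch *s, const char *fmt,
                   char *out, size_t size)
{
    size_t pos = 0;

    if (size == 0)
        return;
    while (*fmt && pos + 1 < size) {
        if (strncmp(fmt, "%ID", 3) == 0) {
            const char *id = stream_unique_id(s);
            size_t n = strlen(id);

            if (n > size - 1 - pos)
                n = size - 1 - pos;
            memcpy(out + pos, id, n);
            pos += n;
            fmt += 3;
        } else {
            out[pos++] = *fmt++;
        }
    }
    out[pos] = '\0';
}

/* Generates the ID on first use. A recursive reference made while the ID
 * is being generated evaluates to an empty string. */
const char *stream_unique_id(struct stream_sketch *s)
{
    if (s->generating)
        return "";
    if (!s->has_id) {
        s->generating = 1;
        expand_format(s, s->id_format, s->unique_id, sizeof(s->unique_id));
        s->generating = 0;
        s->has_id = 1;
    }
    return s->unique_id;
}
```

With an ID format of "id-%ID", expanding "req %ID done" generates the ID
on demand and yields "req id- done" instead of looping.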
In issue #2995, Thomas Kjaer reported that empty argument position
reporting had been broken yet again. This time it was broken by this
latest fix: 2b60e54fb1 ("BUG/MINOR: tools: improve parse_line()'s
robustness against empty args"). It turns out that this fix is not
the culprit and it's in fact correct. The culprit was the original
commit of this series, 7e4a2f39ef ("BUG/MINOR: tools: do not create
an empty arg from trailing spaces"), which used to reset arg_start
to outpos for every new char in addition to doing it for every arg.
This resulted in the end of the line always being seen as in
error, thus reporting an incorrect position that the caller would
correct in a generic way designating the beginning of the line. It
did not show up before the fix mentioned above because the misassigned
value was almost unused until then.
Assigning the value before entering the loop fixes this problem and
doesn't break the series of previous oss-fuzz reproducers. Hopefully
it's the last one again.
This must be backported to 3.2. Thanks to @tkjaer for reporting the
issue along with a reproducer.
There was already a check for this but there used to be an exception
that allowed duplicate server names only in case where their IDs were
explicit and different. This has been emitting a warning since 3.1 and
planned for removal in 3.3, so let's do it now. The doc was updated,
though it never mentioned this uniqueness constraint, so that was added.
Only the check for the exception was removed, the rest of the code
that is currently made to deal with duplicate server names was not
cleaned yet (e.g. the tree doesn't need to support dups anymore, and
this could be done at insertion time). This may be a subject for future
cleanups.
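The uniqueness rule itself amounts to rejecting any name that was already
seen in the same backend. A naive quadratic sketch is shown below;
first_duplicate() is a hypothetical helper, whereas the real code relies
on a tree of server names.

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of the first name that duplicates an earlier one,
 * or -1 if all <count> names are unique. */
int first_duplicate(const char *const *names, size_t count)
{
    size_t i, j;

    for (i = 1; i < count; i++) {
        for (j = 0; j < i; j++) {
            if (strcmp(names[i], names[j]) == 0)
                return (int)i;
        }
    }
    return -1;
}
```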
As warned since 3.1, it's no longer permitted to have a frontend and
a backend under the same name. This causes too many designation issues,
and causes trouble with stick-tables as well. Now each proxy name is
unique.
This commit only changes the check to return an error. Some code
currently exists to find the best candidates; it can be simplified
in future cleanup patches. The doc was updated.
On the frontend side, quic_conn is only released if MUX wasn't allocated,
either due to handshake abort, in which case the upper layer is never
allocated, or after transfer completion when full conn + MUX layers are
already released.
On the backend side, initialization is not performed in the same order.
Indeed, in this case, the connection is instantiated first, then the
quic_conn is created to execute the handshake, while MUX is still only
allocated on handshake completion. As such, it is no longer possible
to free quic_conn immediately on handshake failure. Otherwise, this can
cause a crash if the connection tries to access its transport layer again
after quic_conn release.
Such a crash can easily be reproduced in case of a connection error to the
QUIC server. Here is an example of a backtrace that was observed.
Thread 1 "haproxy" received signal SIGSEGV, Segmentation fault.
0x0000555555739733 in quic_close (conn=0x55555734c0d0, xprt_ctx=0x5555573a6e50) at src/xprt_quic.c:28
28 qc->conn = NULL;
[ ## gdb ## ] bt
#0 0x0000555555739733 in quic_close (conn=0x55555734c0d0, xprt_ctx=0x5555573a6e50) at src/xprt_quic.c:28
#1 0x00005555559c9708 in conn_xprt_close (conn=0x55555734c0d0) at include/haproxy/connection.h:162
#2 0x00005555559c97d2 in conn_full_close (conn=0x55555734c0d0) at include/haproxy/connection.h:206
#3 0x00005555559d01a9 in sc_detach_endp (scp=0x7fffffffd648) at src/stconn.c:451
#4 0x00005555559d05b9 in sc_reset_endp (sc=0x55555734bf00) at src/stconn.c:533
#5 0x000055555598281d in back_handle_st_cer (s=0x55555734adb0) at src/backend.c:2754
#6 0x000055555588158a in process_stream (t=0x55555734be10, context=0x55555734adb0, state=516) at src/stream.c:1907
#7 0x0000555555dc31d9 in run_tasks_from_lists (budgets=0x7fffffffdb30) at src/task.c:655
#8 0x0000555555dc3dd3 in process_runnable_tasks () at src/task.c:889
#9 0x0000555555a1daae in run_poll_loop () at src/haproxy.c:2865
#10 0x0000555555a1e20c in run_thread_poll_loop (data=0x5555569d1c00 <ha_thread_info>) at src/haproxy.c:3081
#11 0x0000555555a1f66b in main (argc=5, argv=0x7fffffffde18) at src/haproxy.c:3671
To fix this, change the condition prior to calling quic_conn release. If
<conn> member is not NULL, delay the release, similarly to the case when
MUX is allocated. This allows connection to be freed first, and detach
from quic_conn layer through close xprt operation.
No need to backport.
In fwlc_get_next_server(), if a server to avoid has been provided, and
we have to ignore it, don't forget to increase the number of unusable
servers, otherwise we may end up ignoring it over and over, never
switching to another server, in an infinite loop until the process gets
killed.
This hopefully fixes Github issues #3004 and #3014.
This should be backported to 3.2.
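The failure mode can be illustrated with a minimal sketch. It is not the
actual fwlc code: the struct srv type, the pick_server() helper and the flat
array are hypothetical stand-ins for the real tree walk. The only point is
that a skipped server has to be counted as unusable so that the loop
terminates once every candidate has been rejected.

```c
#include <stddef.h>

/* Hypothetical, simplified server descriptor (not haproxy's struct server). */
struct srv {
	int usable;   /* non-zero if the server may take a new connection */
};

/* Return the first usable server other than <avoid>, or NULL when every
 * candidate was rejected. Each rejected server, including the one we were
 * asked to avoid, is counted so that the retry loop is bounded.
 */
struct srv *pick_server(struct srv *list, size_t nb, const struct srv *avoid)
{
	size_t unusable = 0;
	size_t i = 0;

	while (unusable < nb) {
		struct srv *s = &list[i];

		if (s == avoid || !s->usable) {
			unusable++;          /* also done for <avoid>, else we spin */
			i = (i + 1) % nb;
			continue;
		}
		return s;
	}
	return NULL;
}
```

With a single server that is also the one to avoid, the sketch returns NULL
after one pass instead of looping forever.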
Implement attach and avail_streams mux-ops callbacks, which are used on
backend side for connection reuse.
Attach operation is used to initiate new streams on the connection
outside of the first one. It simply relies on qcc_init_stream_local() to
instantiate a new QCS instance, which is immediately linked to its
stream data layer.
Outside of attach, it is also necessary to implement avail_streams so
that the stream layer will try to initiate connection reuse. This method
reports the number of bidirectional streams which can still be opened
for the QUIC connection. It depends directly on the flow-control value
advertised by the peer. Thus, this ensures that attach won't cause any
flow control violation.
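As a rough illustration of the computation described above, here is a
minimal sketch. The avail_bidi_streams() helper and its parameters are
hypothetical and do not reflect the actual QUIC MUX structures; it only
assumes that the peer limit is a cumulative count of streams.

```c
#include <stdint.h>

/* Hypothetical helper: number of bidirectional streams that may still be
 * opened, given the cumulative limit advertised by the peer and the
 * number of streams already opened locally.
 */
uint64_t avail_bidi_streams(uint64_t peer_max_streams, uint64_t opened)
{
	if (opened >= peer_max_streams)
		return 0;  /* credit exhausted: attach must not be attempted */
	return peer_max_streams - opened;
}
```

Reporting 0 when the credit is exhausted is what keeps the stream layer from
attempting a reuse that would exceed the peer limit.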
Prior to initiating the first stream on the backend side, ensure that peer
flow-control allows at least a single bidirectional stream to be
created. If this is not the case, abort MUX init operation.
Before this patch, flow-control limit was not checked. Hence, if peer
does not allow any bidirectional stream, haproxy would violate it, which
would then cause the peer to close the connection.
Note that with the current situation, haproxy won't be able to talk to
servers which use 0 for initial max bidi streams. A proper solution
could be to pause the request until a MAX_STREAMS is received, under
timeout supervision to ensure the connection is closed if no frame is
received.
Implement support for MAX_STREAMS frame. On frontend, this was mostly
useless as haproxy would never initiate new bidirectional streams.
However, this becomes necessary to control stream flow-control when
using QUIC as a client on the backend side.
Parsing of MAX_STREAMS is implemented via new qcc_recv_max_streams().
This allows updating <ms_uni>/<ms_bidi> QCC fields.
This patch is necessary to achieve QUIC backend connection reuse.
Previously, no check on peer flow-control was implemented prior to opening
a local QUIC stream. This was a small problem for frontend
implementation, as in this case haproxy as a server never opens
bidirectional streams.
On frontend, the only stream opened by haproxy in this case is for
HTTP/3 control unidirectional data. If the peer uses an initial value
for max uni streams set to 0, it would violate its flow control, and the
peer will probably close the connection. Note however that RFC 9114
mandates that each peer defines minimal initial value so that at least
the control stream can be created.
This commit improves the situation of too low initial max uni streams
value. Now, on HTTP/3 layer initialization, haproxy preemptively checks
flow control limit on streams via a new function
qcc_fctl_avail_streams(). If credit is already expired due to a too
small initial value, haproxy preemptively closes the connection using
H3_ERR_GENERAL_PROTOCOL_ERROR. This behavior is better as haproxy is now
the initiator of the connection closure.
This should be backported up to 2.8.
Remove avail_streams_bidi/avail_streams_uni mux_ops. These callbacks
were designed to be specific to QUIC. However, they won't be necessary,
as stream layer only cares about bidirectional streams.
Add some notes on which load-balancing algorithms can be considered
deterministic or non-deterministic, and add some examples for each type.
This was asked via mailing list to clarify the usage of
prefer-last-server option.
This can be backported to all stable versions.
Add checks to ensure that :status pseudo-header received in HTTP/3
response is valid. If either the header is not provided, or it isn't a
3-digit number, the response is considered invalid and the stream is
rejected. Also, the glitch counter is now incremented in any of these cases.
This should fix coverity report from github issue #3009.
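The validation rule can be sketched as follows. The parse_h3_status() name
and its NULL convention for an absent header are assumptions made for the
example, not the function used in the H3 code.

```c
#include <stddef.h>

/* Return the numeric status if <val> is exactly three ASCII digits, or -1
 * otherwise. An absent :status header is represented here by NULL.
 */
int parse_h3_status(const char *val, size_t len)
{
	size_t i;
	int code = 0;

	if (!val || len != 3)
		return -1;
	for (i = 0; i < len; i++) {
		if (val[i] < '0' || val[i] > '9')
			return -1;
		code = code * 10 + (val[i] - '0');
	}
	return code;
}
```

A caller would treat a negative return as an invalid response, reject the
stream and account for a glitch.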
Convert BUG_ON_HOT() statements to BUG_ON() if HTX start-line is either
missing or duplicated when transcoding into a HTTP/3 request. This
ensures that such abnormal conditions will be detected even on default
builds.
This is linked to coverity report #3008.
Finalize HTTP/3 response transcoding into HTX message. This patch
implements conversion of HTTP/3 headers provided by the server into HTX
blocks.
Special checks have been implemented to reject connection-specific
headers, causing the stream to be shut in error. Also, handling of
content-length requires that the body size is equal to the value
advertized in the header to prevent HTTP desync.
On the backend side, HTTP/3 request response from server is transcoded
into a HTX message. Previously, a fixed value was used for the status
code.
Improve this by extracting the value specified by the server and set it
into the HTX status line. This requires to detect :status pseudo-header
from the HTTP/3 response.
Implement basic support for HTTP/3 request response transcoding into
HTX. This is done via a new dedicated function h3_resp_headers_to_htx().
A valid HTX status-line is allocated and stored. Status code is
hardcoded to 200 for now.
Following patches will be added to remove hardcoded status value and
also handle response headers provided by the server.
Refactor HTTP/3 request headers transcoding to HTX done in
h3_headers_to_htx(). Some operations are extracted into dedicated
functions, to check pseudo-headers and headers conformity, and also trim
the value of headers before encoding it in HTX.
The objective will be to simplify implementation of HTTP/3 response
transcoding by reusing these functions.
Also, h3_headers_to_htx() has been renamed to h3_req_headers_to_htx(),
to highlight that it is reserved to frontend usage.
Implement proper encoding of HTTP/3 authority pseudo-header during
request transcoding on the backend side. A pseudo-header :authority is
encoded if a value can be extracted from HTX start-line. A special check
is also implemented to ensure that a host header is not encoded if
:authority already is.
A new function qpack_encode_auth() is defined to implement QPACK
encoding of :authority header using literal field line with name ref.
Previously, HTTP/3 backend request :path was hardcoded to value '/'.
Change this so that we can now encode any path as requested by the
client. Path is extracted from the HTX URI. Also, qpack_encode_path() is
extended to support literal field line with name ref.
Previously, scheme was always set to https when transcoding an HTX
start-line into a HTTP/3 request. Change this so this conversion is now
fully compliant.
If no scheme is specified by the client, which is what happens most of
the time with HTTP/1, https is set for the HTTP/3 request. Else, reuse
the scheme requested by the client.
If either https or http is set, qpack_encode_scheme will encode it using
entry from QPACK static table. Else, a full literal field line with name
ref is used instead as the scheme value is specified as-is.
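The two representations can be sketched as below. This is not
qpack_encode_scheme() itself: sketch_encode_scheme() is a hypothetical,
simplified helper which assumes the QPACK static table indexes from RFC 9204
(22 for ":scheme http", 23 for ":scheme https"), no Huffman coding, and a
value short enough to fit a single length byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode a :scheme field line into <out> (capacity <cap>) and return the
 * number of bytes written, or 0 if the output does not fit or the value
 * is too long for this simplified sketch (length must be below 127 so
 * that it fits the 7-bit prefix without continuation).
 */
size_t sketch_encode_scheme(uint8_t *out, size_t cap, const char *scheme, size_t len)
{
	if (len == 5 && memcmp(scheme, "https", 5) == 0) {
		if (cap < 1)
			return 0;
		out[0] = 0xC0 | 23;   /* indexed field line, static entry 23 */
		return 1;
	}
	if (len == 4 && memcmp(scheme, "http", 4) == 0) {
		if (cap < 1)
			return 0;
		out[0] = 0xC0 | 22;   /* indexed field line, static entry 22 */
		return 1;
	}
	if (len >= 127 || cap < 3 + len)
		return 0;
	/* Literal field line with name reference, static table, name index 22.
	 * The index does not fit the 4-bit prefix (max 15), so the prefix is
	 * saturated and the remainder (22 - 15) follows in the next byte.
	 */
	out[0] = 0x50 | 0x0F;
	out[1] = 22 - 15;
	out[2] = (uint8_t)len;    /* H=0 (no Huffman), 7-bit length */
	memcpy(out + 3, scheme, len);
	return 3 + len;
}
```

The common http/https cases cost a single byte, while any other scheme is
sent as-is after a two-byte name reference and a length byte.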
On the backend side, HTX start-line is converted into a HTTP/3 request
message. Previously, GET method was hardcoded. Implement proper method
conversion, by extracting it from the HTX start-line.
qpack_encode_method() has also been extended, so that it is able to
encode any method, either using a static table entry, or with a literal
field line with name ref representation.
Implement encoding of HTTP/3 request headers during HTX->H3 conversion
on the backend side. This simply relies on h3_encode_header().
Special check is implemented to ensure that connection-specific headers
are ignored. An HTTP/3 endpoint must never generate them, or the peer
will consider the message as malformed.
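A minimal sketch of such a filter is shown below. The is_conn_specific_hdr()
helper is hypothetical, and the list only contains the fields named in
RFC 9114 section 4.2; the exact set checked by haproxy may differ.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if <name> (lowercase, <len> bytes) is one of the
 * connection-specific header fields named by RFC 9114, else 0.
 */
int is_conn_specific_hdr(const char *name, size_t len)
{
	static const char *const names[] = {
		"connection", "keep-alive", "proxy-connection",
		"transfer-encoding", "upgrade",
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strlen(names[i]) == len && memcmp(names[i], name, len) == 0)
			return 1;
	}
	return 0;
}
```

During HTX->H3 conversion, a header for which this returns 1 would simply be
skipped instead of being passed to the encoder.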
This commit is the first of a series whose aim is to implement
transcoding of a HTX request into HTTP/3, which is necessary for QUIC
backend support.
Transcoding is implemented via a new function h3_req_headers_send()
when a HTX start-line is parsed. For now, most of the request fields are
hardcoded, using a GET method. This will be adjusted in the
following patches.
On backend side, QUIC MUX needs to initialize the first local stream
during MUX init operation. This is necessary so that the first transfer
can then be performed.
sc_attach_mux() is used to attach the created QCS instance to its stream
data layer. However, return value was not checked, which may cause
issues on allocation error. This patch fixes it by returning an error on
MUX init operation and freeing the QCS instance in case of
sc_attach_mux() error.
This fixes coverity report from github issue #3007.
No need to backport.
When a connection error is reported, we try to collect as much information
as possible on the connection status and the server status is adjusted
accordingly. However, the function does nothing if there is no connection
error and if the healthcheck is not expired yet. It is a problem when an
internal error occurred. It may happen at many places and it is hard to be
sure an error is reported on the connection. And in fact, it is already a
problem when the multiplexer allocation fails. In that case, the healthcheck
is not interrupted as it should be. Concretely, it could only happen when a
connection is established.
It is hard to predict the effects of this bug. It may be unimportant. But it
could probably lead to a crash. To avoid any issue, a SOCKERR status is now
set by default when a connection error is reported. There is no reason to
report a connection error for nothing. So a healthcheck failure must be
reported. There is no "internal error" status. So a socket error is
reported.
This patch must be backported to all stable versions.
This is not strictly a bug fix. But APPCTX_FL_EOS and APPCTX_FL_ERROR
flags must be handled first. These flags are set by the applet itself and
should mark the end of all processing. So there is no reason to get the
output buffer in the first place.
This patch could be backported as far as 3.0.
The output buffer must be available to process a command, at least to be
able to emit error messages. When this buffer is full or cannot be
allocated, we must wait. In that case, we must take care to notify that the
SE will not consume input data. It is important to avoid wakeups in loop,
especially when the client aborts.
When the output buffer is available again and no longer full, and the CLI
applet is waiting for a command line, it must notify it will consume input
data.
This patch must be backported as far as 3.0.
QUIC support on the backend side has been implemented recently. This has
led to some adjustments on qc_new_conn() to handle both FE and BE sides,
with some of these changes performed by the following commit.
29fb1aee57
MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn())
An issue was introduced during some code adjustment. Initialization of
ODCID was incorrectly performed, which caused haproxy to emit invalid
transport parameters. Most of the clients detected this and immediately
closed the connection.
Fix this by adjusting qc_lstnr_params_init() invocation: replace
<qc.dcid>, which in fact points to the received SCID, by <qc.odcid>
whose purpose is dedicated to original DCID storage.
This fixes github issue #3006. This issue also caused the majority of
tests in the interop to fail.
No backport needed.
This patch is OpenSSL3.5 QUIC API specific. It fixes
OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() callback (see man(3) SSL_set_quic_tls_cb).
The role of this callback is to store the transport parameters received from the peer.
At this time it is never used by QUIC listeners because there is another callback
which is used to store the transport parameters. This latter callback is not specific
to OpenSSL 3.5 QUIC API. As far as I know, the TLS stack calls only once
one of the callbacks which have been set to receive and store the transport parameters.
That said, OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() is called for QUIC
backends to store the server transport parameters.
qc_ssl_set_quic_transport_params() is useless in this callback. It is dedicated
to storing the local transport parameters (which are sent to the peer). Furthermore
<server> second parameter of quic_transport_params_store() must be 0 for a listener
(or QUIC server) which calls it, denoting it does not receive the transport parameters
of a QUIC server. It must be 1 for a QUIC backend (a QUIC client which receives
the transport parameters of a QUIC server).
Must be backported to 3.2.
On backend side, HTTP/0.9 response body is copied into stream data HTX
buffer. Properly handle the case where the HTX out buffer space is too
small. Only copy part of the HTTP response. Transcoding will
be restarted when new room is available.
When QUIC is used on the frontend side, communication with clients
using a privileged port is restricted. This is a simple protection
against DNS/NTP spoofing.
This feature should not be activated on the backend side, as in this
case it is quite frequent to exchange with servers running on privileged
ports. As such, a new parameter is added to quic_recv() so that it is
only active on the frontend side.
Without this patch, it is impossible to communicate with QUIC servers
running on privileged ports, as incoming datagrams would be silently
dropped.
No need to backport.
The keep-query redirect option must do nothing if there is no query-string.
However, there is a bug. When there is no QS, an error is returned, leading
to return a 500-internal-error to the client.
To fix the bug, instead of returning 0 when there is no QS, we just skip the
QS processing.
This patch should fix the issue #3005. It must be backported as far as 3.1.
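The intended behavior can be sketched as follows. The build_redirect()
helper is purely illustrative and unrelated to the actual redirect code; it
only shows that an absent query string is skipped rather than reported as an
error.

```c
#include <stddef.h>
#include <string.h>

/* Build "<location>?<query>" into <out> when <query> is non-empty, or just
 * "<location>" when there is no query string. Returns the resulting length,
 * or -1 only when <out> is too small. A missing query string is not an error.
 */
int build_redirect(char *out, size_t cap, const char *location, const char *query)
{
	size_t llen = strlen(location);
	size_t qlen = (query && *query) ? strlen(query) : 0;
	size_t need = llen + (qlen ? 1 + qlen : 0) + 1;

	if (need > cap)
		return -1;
	memcpy(out, location, llen);
	if (qlen) {                 /* skip the QS processing when there is none */
		out[llen] = '?';
		memcpy(out + llen + 1, query, qlen);
		llen += 1 + qlen;
	}
	out[llen] = '\0';
	return (int)llen;
}
```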
NEW_TOKEN frame is never emitted by a client, hence parsing was not
tested on frontend side.
On backend side, an issue can occur, as expected token length is static,
based on the token length used internally by haproxy. This is not
sufficient for most server implementations, which use larger tokens. This
causes a parsing error, which may cause skipping of following frames in
the same packet. This issue was detected using ngtcp2 as server.
As for now tokens are unused by haproxy, simply discard test on token
length during NEW_TOKEN frame parsing. The token itself is merely
skipped without being stored. This is sufficient for now to continue on
experimenting with QUIC backend implementation.
This does not need to be backported.
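The relaxed parsing can be sketched as below, assuming the frame layout from
RFC 9000 (a variable-length integer token length followed by the token).
Both sketch_varint_dec() and skip_new_token() are hypothetical helpers
written for this example and are not the functions used by haproxy.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a QUIC variable-length integer (RFC 9000 section 16) from
 * <buf>/<len> into <*val>. Returns the number of bytes consumed, or 0 on
 * truncated input.
 */
size_t sketch_varint_dec(const uint8_t *buf, size_t len, uint64_t *val)
{
	size_t n, i;

	if (!len)
		return 0;
	n = (size_t)1 << (buf[0] >> 6);     /* 1, 2, 4 or 8 bytes */
	if (len < n)
		return 0;
	*val = buf[0] & 0x3F;
	for (i = 1; i < n; i++)
		*val = (*val << 8) | buf[i];
	return n;
}

/* Skip the body of a NEW_TOKEN frame (after the frame type): a varint
 * token length followed by the token itself. No expected length is
 * enforced and the token is not stored. Returns the number of bytes
 * consumed, or 0 on error.
 */
size_t skip_new_token(const uint8_t *buf, size_t len)
{
	uint64_t tlen;
	size_t n = sketch_varint_dec(buf, len, &tlen);

	if (!n || tlen > len - n)
		return 0;
	return n + (size_t)tlen;
}
```

Since the returned size covers the whole frame body, the frames which follow
in the same packet can still be parsed whatever the token size chosen by the
server.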
Report an error during server configuration if QUIC is used but SSL is
not activated via 'ssl' keyword. This is done in _srv_parse_finalize(),
which is both used by static and dynamic servers.
Note that contrary to listeners, an error is reported instead of a
warning, and SSL is not automatically activated if missing. This is
mainly due to the complex server configuration : _srv_parse_finalize()
is ideal to affect every server, including dynamic entries. However, it
is executed after server SSL context allocation performed via
<prepare_srv> XPRT operation. A proper fix would be to move SSL ctx
alloc in _srv_parse_finalize(), but this may have unknown impact. Thus,
for now a simpler solution has been chosen.
QUIC traces in ssl_quic_srv_new_ssl_ctx() are problematic as this
function is called early during startup. If activating traces via -dt
command-line argument, a crash occurs due to stderr sink not yet
available.
Thus, traces from ssl_quic_srv_new_ssl_ctx() are simply removed.
No backport needed.
This commit added a "err" C label reachable only with USE_QUIC_OPENSSL_COMPAT:
MINOR: quic-be: Missing callbacks initializations (USE_QUIC_OPENSSL_COMPAT)
leading coverity to warn this:
*** CID 1611481: Control flow issues (UNREACHABLE)
/src/quic_ssl.c: 802 in ssl_quic_srv_new_ssl_ctx()
796 goto err;
797 #endif
798
799 leave:
800 TRACE_LEAVE(QUIC_EV_CONN_NEW);
801 return ctx;
>>> CID 1611481: Control flow issues (UNREACHABLE)
>>> This code cannot be reached: "err:
SSL_CTX_free(ctx);".
802 err:
803 SSL_CTX_free(ctx);
804 ctx = NULL;
805 TRACE_DEVEL("leaving on error", QUIC_EV_CONN_NEW);
806 goto leave;
807 }
The least intrusive (without #ifdef) way to fix this is to add a "goto err"
statement from the code part which is reachable without USE_QUIC_OPENSSL_COMPAT.
Thank you to @chipitsine for having reported this issue in GH #3003.
This issue may occur when qc_new_conn() fails after having allocated
and attached <conn_cid> to its tree. This is the case when compiling
haproxy against WolfSSL for an unknown reason at this time. In this
case the <conn_cid> is freed by pool_head_quic_connection_id(), then
freed again by quic_conn_release().
This bug arrived with this commit:
MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn())
So, the aim of this patch is to free <conn_cid> only for QUIC backends
and if it is not attached to its tree. This is the case when <conn_id>
local variable passed with NULL value to qc_new_conn() is then initialized
to the same <conn_cid> value.
This patch should have come with this last commit for the last qc_new_conn()
modifications for QUIC backends:
MINOR: quic-be: get rid of ->li quic_conn member
qc_new_conn() must be passed NULL pointers for several variables as mentioned
by the comment. Some of these local variables are used to avoid too many
code modifications.
Implement transcoding of a HTX request into HTTP/0.9. This protocol is a
simplified version of HTTP. Request only supports GET method without any
header. As such, only a request line is written during snd_buf
operation.
Implement transcoding of a HTTP/0.9 response into a HTX message.
HTTP/0.9 is a really simple subset of HTTP spec. The response does
not have any status line and contains only the payload body. Response
is finished when the underlying connection/stream is closed.
A status line is generated to be compliant with HTX. This is performed
on the first invocation of rcv_buf for the current stream. Status code
is set to 200. Payload body if present is then copied using
htx_add_data().
This commit is the second and final step to initiate QUIC MUX on the
backend side. On handshake completion, MUX is woken up just after its
creation. This step is necessary to notify the stream layer, via the QCS
instance pre-initialized on MUX init, so that the transfer can be
resumed.
This mode of operation is similar to TCP stack when TLS+ALPN are used,
which forces MUX initialization to be delayed after handshake
completion.
Adjust qmux_init() to handle frontend and backend sides differently.
Most notably, on backend side, the first bidirectional stream is created
preemptively. This step is necessary as MUX layer will be woken up just
after handshake completion.
Stream data layer is notified that data is expected when FIN is
received, which marks the end of the HTTP request. This prepares data
layer to be able to handle the expected HTTP response.
Thus, this step is only relevant on frontend side. On backend side, FIN
marks the end of the HTTP response. No further content is expected, thus
expect data should not be set in this case.
Note that se_expect_data() invocation via qcs_attach_sc() is not
protected. This is because this function will only be called during
request headers parsing which is performed on the frontend side.
Mux connection is flagged with new QC_CF_IS_BACK if used on the backend
side. For now the only change is during traces, to be able to
differentiate frontend and backend usage.
Complete documentation for rcv_buf/snd_buf operations. In particular, return
value is now explicitly defined. For H3 layer, associated functions
documentation is also extended.
Use conn_ctrl_init() on the connection when quic_connect_server()
succeeds. This is necessary so that the connection is considered as
completely initialized. Without this, connect operation will be called
again if connection is reused.
On backend side, multiplexer layer is initialized during
connect_server(). However, this step is not performed if ALPN is used,
as the negotiated protocol may be unknown. Multiplexer initialization is
delayed after TLS handshake completion.
There are still exceptions though that force the MUX to be initialized
even if ALPN is used. One of them was if <mux_proto> server field was
already set at this stage, which is the case when an explicit proto is
selected on the server line configuration. Remove this condition so that
now MUX init is delayed with ALPN even if proto is forced.
The scope of this change should be minimal. In fact, the only impact
concerns server config with both proto and ALPN set, which is pretty
unlikely as it is contradictory.
The main objective of this patch is to prepare QUIC support on the
backend side. Indeed, QUIC proto will be forced on the server if a QUIC
address is used, similarly to bind configuration. However, we still want
to delay MUX initialization after QUIC handshake completion. This is
mandatory to know the selected application protocol, required during
QUIC MUX init.
Change wake callback behavior for QUIC MUX. This operation loops over
each QCS and notifies their stream data layer on certain events via
internal helper qcc_wake_some_streams().
Previously, streams were notified only if an error occurred on the
connection. Change this to notify streams data layer every time wake
callback is used. This behavior is now identical to H2 MUX.
qcc_wake_some_streams() is also renamed to qcc_wake_streams(), as it
better reflects its true behavior.
This change should not have performance impact as wake mux ops should
not be called frequently. Note that qcc_wake_streams() can also be
called directly via qcc_io_process() to ensure a new error is correctly
propagated. As wake callback first uses qcc_io_process(), it will only
call qcc_wake_streams() if no error is present.
No known issue is associated with this commit. However, it could prevent
freezing transfer under certain condition. As such, it is considered as
a bug fix worthy of backporting.
This should be backported after a period of observation.
It was still failing on Ubuntu-24.04 with GCC+ASAN. So, instead of
trying to understand the code path the compiler followed to report uninitialized
variables, let's init them now.
No backport needed.
commit 16eb0fab3 ("MAJOR: counters: dispatch counters over thread groups")
introduced a build regression on some compilers:
src/listener.c: In function 'listener_accept':
src/listener.c:1095:3: error: 'for' loop initial declarations are only allowed in C99 mode
for (int it = 0; it < global.nbtgroups; it++)
^
src/listener.c:1095:3: note: use option -std=c99 or -std=gnu99 to compile your code
src/listener.c:1101:4: error: 'for' loop initial declarations are only allowed in C99 mode
for (int it = 0; it < global.nbtgroups; it++) {
^
make: *** [src/listener.o] Error 1
make: *** Waiting for unfinished jobs....
Let's fix that.
No backport needed
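The commit message does not spell out the change, so the following is only
an assumption about the usual way to address this kind of error: declare the
iterator at the top of the block instead of inside the for() statement. The
sum_counters() function is a hypothetical example unrelated to
listener_accept().

```c
/* Sum <nb> per-group counters. The iterator is declared at the top of the
 * block rather than inside the for() statement, which keeps the code
 * buildable by compilers defaulting to pre-C99 mode.
 */
unsigned long sum_counters(const unsigned long *per_group, int nb)
{
	unsigned long total = 0;
	int it;

	for (it = 0; it < nb; it++)
		total += per_group[it];
	return total;
}
```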
In hlua_applet_tcp_recv_try() and hlua_applet_tcp_getline_yield(), GCC 14.2
reports warnings about 'blk2' variable that may be used uninitialized. It is
a bit strange because the code is pretty similar to before. But to make it
happy and to avoid bugs if the API changes in future, 'blk2' is now used only
when its length is greater than 0.
No need to backport.
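The guard can be illustrated with a small sketch. copy_two_blocks() is a
hypothetical helper, not the Lua applet code; it only shows the second block
being touched when its length is greater than 0.

```c
#include <stddef.h>
#include <string.h>

/* Copy up to <cap> bytes from two possibly non-contiguous blocks into
 * <dst>. <blk2> is only dereferenced when <len2> is greater than zero,
 * so it may be NULL when there is no second block.
 */
size_t copy_two_blocks(char *dst, size_t cap,
                       const char *blk1, size_t len1,
                       const char *blk2, size_t len2)
{
	size_t done = 0;

	if (len1 > cap)
		len1 = cap;
	if (len1 > 0) {
		memcpy(dst, blk1, len1);
		done = len1;
	}
	if (len2 > 0) {
		if (len2 > cap - done)
			len2 = cap - done;
		memcpy(dst + done, blk2, len2);
		done += len2;
	}
	return done;
}
```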
In hlua_applet_tcp_getline_yield(), the function may yield if there is no
data available. However we must take care to add a return statement just
after the call to hlua_yieldk(). I don't know the details of the LUA API,
but at least, this return statement fixes a build error about uninitialized
variables that may be used.
It is a 3.3-specific issue. No backport needed.
On backend side, MUX is instantiated after QUIC handshake completion.
This step is performed via qc_ssl_provide_quic_data(). First, connection
flags for handshake completion are reset. Then, MUX is instantiated
via conn_create_mux() function.
Force QUIC as <mux_proto> for server if a QUIC address is used. This is
similar to what is already done for bind instances on the frontend
side. This step ensures that conn_create_mux() will select the proper
protocol.
Replace ->li quic_conn pointer to struct listener member by ->target which is
an object type enum and adapt the code.
Use __objt_(listener|server)() where the object type is known. Typically
this is where the code is specific to one connection type (frontend/backend).
Remove <server> parameter passed to qc_new_conn(). It is redundant with the
<target> parameter.
GSO is not supported at this time for QUIC backend. qc_prep_pkts() is modified
to prevent it from building more than an MTU. As a consequence, this prevents
qc_send_ppkts() from using GSO.
ssl_clienthello.c code is run only by listeners. This is why __objt_listener()
is used in place of ->li.
Disable the code around SSL_get_peer_quic_transport_params() as this was done
for USE_QUIC_OPENSSL_COMPAT because SSL_get_peer_quic_transport_params() is not
defined by OpenSSL 3.5 QUIC API.
quic_tls_compat_keylog_callback() is the callback used by the QUIC OpenSSL
compatibility module to derive the TLS secrets from other secrets provided
by keylog. The <write> local variable to this function is initialized to denote
the direction (write to send, read to receive) the secret is supposed to be used
for. That said, as the QUIC cryptographic algorithms are symmetrical, the
direction is reversed between the peers: a secret which is used to write/send/cipher
data from a peer point of view is also the secret which is used to
read/receive/decipher data. This was confirmed by the fact that without this
patch, the TLS stack first provides the peer with Handshake to send/cipher
data. The client could not use such secret to decipher the Handshake packets
received from the server. This patch simply reverses the direction stored by
<write> variable to make the secrets derivation work for the QUIC client.
quic_tls_compat_init() function is called from OpenSSL QUIC compatibility module
(USE_QUIC_OPENSSL_COMPAT) to initialize the keylog callback and the callback
which stores the QUIC transport parameters as a TLS extension into the stack.
These callbacks must also be initialized for QUIC backends.
This is done from TLS secrets derivation callback at Application level (the last
encryption level) calling SSL_get_peer_quic_transport_params() to have an access
to the TLS transport parameters extension embedded into the Server Hello TLS message.
Then, quic_transport_params_store() is called to store a decoded version of
these transport parameters.
For connection to QUIC servers, this patch modifies the moment where the I/O
handler callback is switched to quic_conn_app_io_cb(). Unlike for listeners,
this is no longer done just after the handshake has completed, but just after
it has been confirmed.
Discard the Initial packet number space as soon as possible. This is done
during handshakes in quic_conn_io_cb() as soon as a Handshake packet could
be successfully sent.
The initialization of <ssl_app_data_index> SSL user data index is required
to make all the SSL sessions to QUIC servers work, as this is done for TCP
servers. The conn object is notably retrieved by SSL callbacks which are
server specific (e.g. ssl_sess_new_srv_cb()).
Store the peer connection ID (SCID) as the connection DCID as soon as an Initial
packet is received.
Stop comparing the packet to QUIC_PACKET_TYPE_0RTT if it already matched
QUIC_PACKET_TYPE_INITIAL.
A QUIC server must not send too short datagrams with ack-eliciting packets inside.
This cannot be done from quic_rx_pkt_parse() because one does not know if
there are ack-eliciting frames in the Initial packets. If the packet must be
dropped, this is after having parsed it!
Modify quic_dgram_parse() to stop passing it a listener as third parameter.
In place the object type address of the connection socket owner is passed
to support the haproxy servers with QUIC as transport protocol.
qc_owner_obj_type() is implemented to return this address.
qc_counters() is also implemented to return the QUIC specific counters of
the proxy owning the connection.
quic_rx_pkt_parse() called by quic_dgram_parse() is also modified to use
the object type address used by this latter as last parameter. It is
also modified to send Retry packet only from listeners. A QUIC client
(connection to haproxy QUIC servers) must drop the Initial packets with
non null token length. It is also not supposed to receive 0-RTT packets
which are dropped.
The QUIC datagram redispatch is there to counter the race condition which
exists only for QUIC connections to listener where datagrams may arrive
on the wrong socket between the bind() and connect() calls.
Run this code part only for listeners.
Allocate a connection to connect to QUIC servers from qc_conn_init() which is the
->init() QUIC xprt callback.
Also initialize ->prepare_srv and ->destroy_srv callback as this is done for TCP
servers.
For haproxy QUIC servers (or QUIC clients), the peer is considered as validated.
This is a property which is more specific to QUIC servers (haproxy QUIC listeners).
No <odcid> is used for the QUIC client connection. It is used only on the QUIC server side.
The <token_odcid> is also not used on the QUIC client side. It must be embedded into
the transport parameters only on the QUIC server side.
The quic_conn is created before the socket allocation. So, the local address is
zeroed.
Initialize the transport parameter with qc_srv_params_init().
Stop hardcoding the <server> parameter passed value to qc_new_isecs() to correctly
initialize the Initial secrets.
Modify quic_connect_server() which is the ->connect() callback for QUIC protocol:
- add a BUG_ON() run when entering this function: the <fd> socket must equal -1
- conn->handle is a union. conn->handle.qc is used for QUIC connection,
conn->handle.fd must not be used to store the fd.
- code alignment fix for setsockopt(fd, SOL_SOCKET, (SO_SNDBUF|SO_RCVBUF))
statements
- remove the section of code which was duplicated from ->connect() TCP callback
- fd_insert() the new socket file descriptor created to connect to the QUIC
server with quic_conn_sock_fd_iocb() as callback for read event.
This patch only adds <proto_type> new proto_type enum parameter and <sock_type>
socket type parameter to sock_create_server_socket() and adapts its callers.
This is to prepare the use of this function by QUIC servers/backends.
Modify qc_alloc_ssl_sock_ctx() to pass the connection object as parameter. It is
NULL for a QUIC listener, not NULL for a QUIC server. This connection object is
set as value for ->conn quic_conn struct member. Initialise the SSL session object from
this function for QUIC servers.
qc_ssl_set_quic_transport_params() is also modified to pass the SSL object as parameter.
This is the unique parameter this function needs. <qc> parameter is used only for
the trace.
SSL_do_handshake() must be called as soon as the SSL object is initialized for
the QUIC backend connection. This triggers the TLS CRYPTO data delivery.
tasklet_wakeup() is also called to send this CRYPTO data as soon as possible.
Modify the QUIC_EV_CONN_NEW event trace to dump the potential errors returned by
SSL_do_handshake().
Implement ssl_sock_new_ssl_ctx() to allocate a SSL server context as this is currently
done for TCP servers and also for QUIC servers depending on the <is_quic> boolean value
passed as new parameter. For QUIC servers, this function calls ssl_quic_srv_new_ssl_ctx()
which is specific to QUIC.
From connect_server(), the QUIC protocol could not be retrieved by protocol_lookup()
because of the PROTO_TYPE_STREAM default passed as argument. Instead, to support
QUIC, srv->addr_type.proto_type may be safely passed.
The QUIC servers xprts have already been set at server line parsing time.
This patch prevents the QUIC servers xprts from being reset to the <ssl_sock> value
which is the value used for SSL/TCP connections.
Add ->quic_params new member to server struct.
Also set the ->xprt member of the server being initialized and initialize asap its
transport parameters from _srv_parse_init().
This XPRT callback is called from check_config_validity() after the configuration
has been parsed to initialize all the SSL server contexts.
This patch implements the same thing for the QUIC servers.
Add a little check to verify that the version chosen by the server matches
the client one. Initialize the local transport parameters ->negotiated_version
value with this version if this is the case. If not, return 0.
According to the RFC, a QUIC client must encode the QUIC versions it supports
into the "Available Versions" field of the "Version Information" transport
parameter, ordered by descending preference.
This is done by defining the new <quic_version_2> and <quic_version_draft_29>
variables, which point to the corresponding elements of the <quic_versions> array.
A client announces its available versions as follows: v1, v2, draft29.
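As an illustration of the ordering above, here is a minimal sketch (not the
haproxy code) which serializes a chosen version followed by the available
versions in the v1, v2, draft29 order, each as a 32-bit big-endian integer.
The helper names encode_version_info() and put_u32_be() are hypothetical, and
only the parameter value is encoded, not its identifier nor its length.

```c
#include <stddef.h>
#include <stdint.h>

/* QUIC version numbers: v1 (RFC 9000), v2 (RFC 9369) and draft-29. */
#define QUIC_V1        0x00000001U
#define QUIC_V2        0x6b3343cfU
#define QUIC_DRAFT_29  0xff00001dU

/* Client preference order described above: v1, v2, draft29. */
static const uint32_t available_versions[] = { QUIC_V1, QUIC_V2, QUIC_DRAFT_29 };

/* Store <v> in network byte order at <p> and return the next position. */
static unsigned char *put_u32_be(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
	return p + 4;
}

/* Hypothetical helper: encode the chosen version followed by the available
 * versions into <buf>. Returns the number of bytes written, or 0 if <buf>
 * is too small.
 */
size_t encode_version_info(unsigned char *buf, size_t size, uint32_t chosen)
{
	size_t nb = sizeof(available_versions) / sizeof(available_versions[0]);
	unsigned char *p = buf;
	size_t i;

	if (size < 4 * (nb + 1))
		return 0;
	p = put_u32_be(p, chosen);
	for (i = 0; i < nb; i++)
		p = put_u32_be(p, available_versions[i]);
	return (size_t)(p - buf);
}
```

With three available versions the encoded value is 16 bytes long.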
Activate QUIC protocol support for MUX-QUIC on the backend side,
additionally to current frontend support. This change is mandatory to be
able to implement QUIC on the backend side.
Without this modification, it is impossible to explicitly activate the QUIC
protocol on a server line, hence an error is reported :
config : proxy 'xxxx' : MUX protocol 'quic' is not usable for server 'yyyy'
Mark QUIC address support for servers as experimental on the backend
side. Previously, it was allowed but wouldn't function as expected. As
QUIC backend support requires several changes, it is better to declare
it as experimental first.
QUIC is not implemented on the backend side. To prevent any issue, it is
better to reject any server configured which uses it. This is done via
_srv_parse_init() which is used both for static and dynamic servers.
This should be backported up to all stable versions.
Released version 3.3-dev1 with the following main changes :
- BUILD: tools: properly define ha_dump_backtrace() to avoid a build warning
- DOC: config: Fix a typo in 2.7 (Name format for maps and ACLs)
- REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (5)
- REGTESTS: Remove REQUIRE_VERSION=2.3 from all tests
- REGTESTS: Remove REQUIRE_VERSION=2.4 from all tests
- REGTESTS: Remove tests with REQUIRE_VERSION_BELOW=2.4
- REGTESTS: Remove support for REQUIRE_VERSION and REQUIRE_VERSION_BELOW
- MINOR: server: group postinit server tasks under _srv_postparse()
- MINOR: stats: add stat_col flags
- MINOR: stats: add ME_NEW_COMMON() helper
- MINOR: proxy: collect per-capability stat in proxy_cond_disable()
- MINOR: proxy: add a true list containing all proxies
- MINOR: log: only run postcheck_log_backend() checks on backend
- MEDIUM: proxy: use global proxy list for REGISTER_POST_PROXY_CHECK() hook
- MEDIUM: server: automatically add server to proxy list in new_server()
- MEDIUM: server: add and use srv_init() function
- BUG/MAJOR: leastconn: Protect tree_elt with the lbprm lock
- BUG/MEDIUM: check: Requeue healthchecks on I/O events to handle check timeout
- CLEANUP: applet: Update comment for applet_put* functions
- DEBUG: check: Add the healthcheck's expiration date in the trace messags
- BUG/MINOR: mux-spop: Fix null-pointer deref on SPOP stream allocation failure
- CLEANUP: sink: remove useless cleanup in sink_new_from_logger()
- MAJOR: counters: add shared counters base infrastructure
- MINOR: counters: add shared counters helpers to get and drop shared pointers
- MINOR: counters: add common struct and flags to {fe,be}_counters_shared
- MEDIUM: counters: manage shared counters using dedicated helpers
- CLEANUP: counters: merge some common counters between {fe,be}_counters_shared
- MINOR: counters: add local-only internal rates to compute some maxes
- MAJOR: counters: dispatch counters over thread groups
- BUG/MEDIUM: cli: Properly parse empty lines and avoid crashed
- BUG/MINOR: config: emit warning for empty args only in discovery mode
- BUG/MINOR: config: fix arg number reported on empty arg warning
- BUG/MINOR: quic: Missing SSL session object freeing
- MINOR: applet: Add API functions to manipulate input and output buffers
- MINOR: applet: Add API functions to get data from the input buffer
- CLEANUP: applet: Simplify a bit comments for applet_put* functions
- MEDIUM: hlua: Update TCP applet functions to use the new applet API
- BUG/MEDIUM: fd: Use the provided tgid in fd_insert() to get tgroup_info
- BUG/MINIR: h1: Fix doc of 'accept-unsafe-...-request' about URI parsing
The description of the tests performed on the URI in H1 when the
'accept-unsafe-violations-in-http-request' option is set is wrong. It states
that only characters below 32 and 127 are blocked when this option is set,
suggesting that otherwise, when it is not set, all invalid characters in the
URI, according to RFC3986, are blocked.
But in fact, this is not true. By default all characters below 32 and above 127
are blocked. And when the 'accept-unsafe-violations-in-http-request' option is
set, characters above 127 (excluded) are accepted. But characters in
(33..126) are never checked, independently of this option.
This patch should fix issue #2906. It should be backported as far as
3.0. For older versions, the documentation could also be clarified because
this part is not really clear.
Note that the request URI validation is still under discussion because invalid
characters in (33..126) are never checked and some users request a stricter
parsing.
In fd_insert(), use the provided tgid to get the thread group info,
instead of using the one of the current thread, as we may call
fd_insert() from a thread of another thread group. That will happen at
least when binding the listeners. Otherwise we'd end up accessing the
thread mask containing the enabled threads of the wrong thread group, which
can lead to crashes if we're binding on threads not present in the
thread group.
This should fix Github issue #2991.
This should be backported up to 2.8.
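The idea can be sketched as follows. This is not the haproxy code: the
tgroup_info_sketch type, the tgroups array and both helpers are hypothetical,
tgid is assumed to be 1-based, and what is done with the resulting mask is
also an assumption. The point is only that the lookup uses the tgid passed by
the caller and never the group of the calling thread.

```c
#include <stddef.h>

#define MAX_TGROUPS_SKETCH 4

/* Hypothetical per-group info: mask of the threads enabled in the group. */
struct tgroup_info_sketch {
	unsigned long threads_enabled;
};

static struct tgroup_info_sketch tgroups[MAX_TGROUPS_SKETCH] = {
	{ 0x0fUL }, /* group 1: 4 threads */
	{ 0x03UL }, /* group 2: 2 threads */
};

/* Return the info of group <tgid> (1-based), or NULL if out of range. */
const struct tgroup_info_sketch *tgroup_by_id(int tgid)
{
	if (tgid < 1 || tgid > MAX_TGROUPS_SKETCH)
		return NULL;
	return &tgroups[tgid - 1];
}

/* Restrict <thread_mask> to the threads enabled in the group designated by
 * the caller, whatever the group of the thread running this function.
 */
unsigned long fd_effective_mask(int tgid, unsigned long thread_mask)
{
	const struct tgroup_info_sketch *tgi = tgroup_by_id(tgid);

	return tgi ? (thread_mask & tgi->threads_enabled) : 0;
}
```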
The functions responsible for extracting data from the applet input buffer or
for pushing data into the applet output buffer now rely on the newly added
functions in the applet API. This simplifies the code a bit.
There were already functions to push data from the applet to the stream by
inserting them in the right buffer, depending on whether the applet was using
the legacy API or not. Here, functions to retrieve data pushed to the applet by
the stream were added:
* applet_getchar : Gets one character
* applet_getblk : Copies a full block of data
* applet_getword : Copies one text block representing a word using a
custom separator as delimiter
* applet_getline : Copies one text line
* applet_getblk_nc : Gets one or two blocks of data
* applet_getword_nc: Gets one or two blocks of text representing a word
using a custom separator as delimiter
* applet_getline_nc: Gets one or two blocks of text representing a line
In this patch, some functions were added to ease input and output buffers
manipulation, regardless of whether the corresponding applet is using its own
buffers or relying on channels buffers. The following functions were added:
* applet_get_inbuf : Get the buffer containing data pushed to the applet
by the stream
* applet_get_outbuf : Get the buffer containing data pushed by the applet
to the stream
* applet_input_data : Return the amount of data in the input buffer
* applet_skip_input : Skips <len> bytes from the input buffer
* applet_reset_input: Skips all bytes from the input buffer
* applet_output_room: Returns the amount of space available in the output
buffer
* applet_need_room : Indicates that the applet has more data to deliver
and it needs more room in the output buffer to do
so
qc_alloc_ssl_sock_ctx() allocates an SSL_CTX object for each connection. It also
allocates an SSL object. When this function failed, it freed only the SSL_CTX object.
The correct way to free both of them is to call qc_free_ssl_sock_ctx().
Must be backported as far as 2.6.
If an empty argument is used in configuration, for example due to an
undefined environment variable, the rest of the line is not parsed. As
such, a warning is emitted to report this.
The warning was not totally correct as it reported the wrong argument
index. This patch fixes it. Note that there is still an issue with
the "^" indicator, but this is not as easy to fix yet.
This is related to github issue #2995.
This should be backported up to 3.2.
Hide the warning about empty arguments outside of discovery mode. This is
necessary, otherwise the message is displayed twice, which hampers the
readability of haproxy's output.
This should fix github issue #2995.
This should be backported up to 3.2.
Empty lines were not properly parsed and could lead to crashes because the
last argument was parsed outside of the cmdline buffer. Indeed, the last
argument is parsed to look for a possible payload pattern. Its parsing starts
one character after the newline at the end of the command line. But this is
only valid for a non-empty command line.
So now this case is properly detected and we leave early if an empty line is
found.
This patch must be backported to 3.2.
Most fe and be counters are good candidates for being shared between
processes. They are now grouped inside "shared" struct sub member under
be_counters and fe_counters.
Now that they are properly identified, they would greatly benefit from being
shared over thread groups to reduce the cost of atomic operations when
updating them. For this, we take the current tgid into account so each
thread group only updates its own counters. For this to work, it is
mandatory that the "shared" member from {fe,be}_counters is initialized
AFTER global.nbtgroups is known, because each shared counter causes the stat
to be allocated global.nbtgroups times. When updating a counter without
concurrency, the first counter from the array may be updated.
To consult the shared counters (which requires aggregation of per-tgid
individual counters), some helper functions were added to counter.h to
ease code maintenance and avoid computing errors.
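The principle described above can be sketched as follows, assuming C11
atomics. This is not the haproxy implementation: the tg_counter type and its
helpers are hypothetical and tgid is assumed to be 1-based. Each thread group
only touches its own slot, and reading the counter requires summing all slots.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical counter split over thread groups: one slot per group. */
struct tg_counter {
	unsigned int nbtgroups;
	_Atomic uint64_t *per_tg;
};

/* Must be called once the number of thread groups is known. */
int tg_counter_init(struct tg_counter *c, unsigned int nbtgroups)
{
	unsigned int i;

	if (!nbtgroups)
		return -1;
	c->per_tg = malloc(nbtgroups * sizeof(*c->per_tg));
	if (!c->per_tg)
		return -1;
	for (i = 0; i < nbtgroups; i++)
		atomic_init(&c->per_tg[i], (uint64_t)0);
	c->nbtgroups = nbtgroups;
	return 0;
}

/* Update only the slot of thread group <tgid> (1-based). */
void tg_counter_inc(struct tg_counter *c, unsigned int tgid)
{
	atomic_fetch_add_explicit(&c->per_tg[tgid - 1], 1, memory_order_relaxed);
}

/* Consulting the counter requires aggregating all per-group slots. */
uint64_t tg_counter_total(const struct tg_counter *c)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < c->nbtgroups; i++)
		total += atomic_load_explicit(&c->per_tg[i], memory_order_relaxed);
	return total;
}

void tg_counter_free(struct tg_counter *c)
{
	free(c->per_tg);
	c->per_tg = NULL;
	c->nbtgroups = 0;
}
```

The aggregation cost is paid by readers, which are expected to be much rarer
than updates.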
cps_max (max new connections received per second), sps_max (max new
sessions per second) and http.rps_max (maximum new http requests per
second) all rely on shared counters (namely conn_per_sec, sess_per_sec and
http.req_per_sec). The problem is that shared counters are about to be
distributed over thread groups, and we cannot afford to compute the
total (for all thread groups) each time we update the max counters.
Instead, since such max counters (relying on shared counters) are a very
few exceptions, let's add internal (sess,conn,req) per sec freq counters
that are dedicated to cps_max, sps_max and http.rps_max computing.
Thanks to that, related *_max counters shouldn't be negatively impacted
by the thread-group distribution, yet they will not benefit from it
either. Related internal freq counters are prefixed with "_" to emphasize
the fact that they should not be used for other purpose (the shared ones,
which are about to be distributed over thread groups in upcoming commits
are still available and must be used instead). The internal ones could
eventually be removed at any time if we find another way to compute the
{cps,sps,http.rps}_max counters.
Now that we have a common struct between fe and be shared counters struct
let's perform some cleanup to merge duplicate members into the common
struct part. This will ease code maintenance.
proxies, listeners and server shared counters are now managed via helpers
added in one of the previous commits.
When guid is not set (ie: when not yet assigned), shared counters pointer
is allocated using calloc() (local memory) and a flag is set on the shared
counters struct to know how to manipulate (and free) it. Else if guid is
set, then it means that the counters may be shared so while for now we
don't actually use a shared memory location the API is ready for that.
The way it works, for proxies and servers (for which guid is not known
during creation), we first call counters_{fe,be}_shared_get with guid not
set, which results in local pointer being retrieved (as if we just
manually called calloc() to retrieve a pointer). Later (during postparsing)
if guid is set we try to upgrade the pointer from local to shared.
Lastly, since the memory location for some objects (proxies and servers
counters) may change from creation to postparsing, let's update
counters->last_change member directly under counters_{fe,be}_shared_get()
so we don't miss it.
No change of behavior is expected, this is only preparation work.
fe_counters_shared and be_counters_shared may share some common members
since they are quite similar, so we add a common struct part shared
between the two. struct counters_shared is added for convenience as
a generic pointer to manipulate common members from fe or be shared
counters pointer.
Also, the first common member is added: shared fe and be counters now
have a flags member.
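A minimal sketch of this layout, under the assumption that the common part is
embedded as the first member of each struct (the real definitions may differ).
The cum_conn and cum_lbconn fields, the COUNTERS_F_LOCAL_SKETCH flag and both
helpers are only there for illustration.

```c
#include <stdint.h>

/* Hypothetical flag value, for illustration only. */
#define COUNTERS_F_LOCAL_SKETCH 0x1U

/* Common part, also usable as a generic pointer type. */
struct counters_shared {
	unsigned int flags;
};

struct fe_counters_shared {
	struct counters_shared common; /* must remain the first member */
	uint64_t cum_conn;             /* illustrative fe-only member */
};

struct be_counters_shared {
	struct counters_shared common; /* must remain the first member */
	uint64_t cum_lbconn;           /* illustrative be-only member */
};

/* Generic helpers: work on fe or be counters through the common part. */
void counters_set_flag(struct counters_shared *cs, unsigned int flag)
{
	cs->flags |= flag;
}

int counters_has_flag(const struct counters_shared *cs, unsigned int flag)
{
	return !!(cs->flags & flag);
}
```

Since the common part is the first member, a pointer to either struct may be
converted to a struct counters_shared pointer.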
Create include/haproxy/counters.h and src/counters.c files in anticipation
of further helpers, as some counter-specific tasks need to be carried
out and since counters are shared between multiple object types (ie:
listener, proxy, server..) we need generic helpers.
Add some shared counters helper which are not yet used but will be updated
in upcoming commits.
Shareable counters are now tagged as shared counters and are dynamically
allocated in separate memory area as a prerequisite for being stored
in shared memory area. For now, GUID and threads groups are not taken into
account, this is only a first step.
Also, we ensure all counters are now manipulated using atomic operations,
namely, "last_change" counter is now read from and written to using atomic
ops.
Despite the numerous changes caused by the counters being moved away from
counters struct, no change of behavior should be expected.
As reported by Ilya in GH #2994, some cleanup parts in
sink_new_from_logger() function are not used.
We can actually simplify the cleanup logic to remove dead code, let's
do that by renaming "error_final" label to "error" and only making use
of the "error" label, because sink_free() already takes care of proper
cleanup for all sink members.
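The single-label pattern can be sketched as follows. The sink_like type and
its functions are hypothetical stand-ins and not the real sink API. Because
the free function tolerates NULL and partially initialized objects, every
failure can jump to the same "error" label.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a sink with two allocated members. */
struct sink_like {
	char *name;
	char *desc;
};

/* Duplicate <s>, returning NULL on allocation failure or if <s> is NULL. */
static char *dup_str(const char *s)
{
	size_t len;
	char *d;

	if (!s)
		return NULL;
	len = strlen(s) + 1;
	d = malloc(len);
	if (d)
		memcpy(d, s, len);
	return d;
}

/* Releases all members: safe on NULL and on partially built objects. */
void sink_like_free(struct sink_like *s)
{
	if (!s)
		return;
	free(s->name);
	free(s->desc);
	free(s);
}

struct sink_like *sink_like_new(const char *name, const char *desc)
{
	struct sink_like *s = calloc(1, sizeof(*s));

	if (!s)
		goto error;
	s->name = dup_str(name);
	if (!s->name)
		goto error;
	s->desc = dup_str(desc);
	if (!s->desc)
		goto error;
	return s;

 error:
	/* one label is enough: the free function handles every state */
	sink_like_free(s);
	return NULL;
}
```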
When we try to allocate a new SPOP stream, if an error is encountered,
spop_strm_destroy() is called to release the possibly allocated
stream. But it must only be called if a stream was allocated. If the
reported error is an SPOP stream allocation failure, we must just leave to
avoid a null-pointer dereference.
This patch should fix point 1 of the issue #2993. It must be backported as
far as 3.1.
These functions were copied from the channel API and modified to work with
applets using the new API or the legacy one. However, the comments were
not updated accordingly. Fixing that is the purpose of this patch.
When a healthcheck is processed, once the first wakeup has passed to start the
check, and as long as the expiration timer is not reached, only I/O events
are able to wake it up. This is an issue when there is a check timeout
defined, especially if the connect timeout is high and the check timeout is
low. In that case, the healthcheck's task is never requeued to handle any
timeout update. When the connection is established, the check timeout is set
to replace the connect timeout. It is thus possible to report a success
while a timeout should be reported.
So, now, when an I/O event is handled, the healthcheck is requeued, except if
a success or an abort is reported.
Thanks to Thierry Fournier for the report and the reproducer.
This patch must be backported to all stable versions.
In fwlc_srv_reposition(), set the server's tree_elt while we still hold
the lbprm read lock. While it was protected from concurrent
fwlc_srv_reposition() calls by the server's lb_lock, it was not from
dequeuing/requeuing that could occur if the server gets down/up or its
weight is changed, and that would lead to inconsistencies, and the
watchdog killing the process because it is stuck in an infinite loop in
fwlc_get_next_server().
This hopefully fixes github issue #2990.
This should be backported to 3.2.
Rename _srv_postparse() internal function to srv_init() function and group
srv_init_per_thr() plus idle conns list init inside it. This way we can
perform some simplifications as srv_init() performs multiple server
init steps after parsing.
SRV_F_CHECKED flag was added, it is automatically set when srv_init()
runs successfully. If the flag is already set and srv_init() is called
again, nothing is done. This makes it possible to manually call srv_init() earlier
than the default POST_CHECK hook when needed without risking to do things
twice.
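The guard described above boils down to the following sketch. The server_like
type and srv_init_sketch() are hypothetical, and the value given to
SRV_F_CHECKED here is a placeholder, not the real one.

```c
/* Placeholder bit value, the real flag value may differ. */
#define SRV_F_CHECKED 0x1U

/* Hypothetical minimal server: flags plus a counter of real init runs. */
struct server_like {
	unsigned int flags;
	int init_runs;
};

/* Returns 0 on success. Does nothing if the server was already initialized,
 * so it may safely be called earlier than the default post-check hook.
 */
int srv_init_sketch(struct server_like *srv)
{
	if (srv->flags & SRV_F_CHECKED)
		return 0;

	/* ... the actual post-parsing init steps would go here ... */
	srv->init_runs++;

	/* only set once the init steps above have succeeded */
	srv->flags |= SRV_F_CHECKED;
	return 0;
}
```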
While new_server() takes the parent proxy as argument and even assigns
srv->proxy to the parent proxy, it didn't actually insert the server
into the parent proxy server list on success.
The result is that sometimes we add the server to the list after
new_server() is called, and sometimes we don't.
This is really error-prone and because of that hooks such as
REGISTER_POST_SERVER_CHECK() which is run for all servers listed in
all proxies may not be relied upon for servers which are not actually
inserted in their parent proxy server list. Plus it feels very strange
to have a server that points to a proxy, but then the proxy doesn't know
about it because it cannot find it in its server list.
To prevent errors and make proxy->srv list reliable, we move the insertion
logic directly under new_server(). This requires to know if we are called
during parsing or during runtime to either insert or append the server to
the parent proxy list. For that we use PR_FL_CHECKED flag from the parent
proxy (if the flag is set, then the proxy was checked so we are past the
init phase, thus we assume we are called during runtime)
This implies that during startup if new_server() has to be cancelled on
error paths we need to call srv_detach() (which is now exposed in server.h)
before srv_drop().
The consequence of this commit is that REGISTER_POST_SERVER_CHECK() should
now run reliably on all servers created using new_server() (without having
to manually loop on global servers_list).
REGISTER_POST_PROXY_CHECK() used to iterate over "main" proxies to run
registered callbacks. This means hidden proxies (and their servers) did
not get a chance to get post-checked and could cause issues if some post-
checks are expected to be executed on all proxies no matter their type.
Instead we now rely on the global proxies list. Another side effect is that
the REGISTER_POST_SERVER_CHECK() now runs as well for servers from proxies
that are not part of the main proxies list.
postcheck_log_backend() checks are executed no matter if the proxy
actually has the backend capability while the checks actually depend
on this.
Let's fix that by adding an extra condition to ensure that the BE
capability is set.
This issue is not tagged as a bug because for now it remains impossible
to have a syslog proxy without BE capability in the main proxy list, but
this may change in the future.
We have a global proxies_list pointer which is announced as the list of
"all existing proxies", but in fact it only represents regular proxies
declared in the config file through "listen, frontend or backend" keywords.
It is ambiguous, and we currently don't have a straightforward method to
iterate over all proxies (either public or internal ones) within haproxy.
Instead we still have to manually iterate over multiple lists (main
proxies, log-forward proxies, peer proxies..) which is error-prone.
In this patch we add a struct list member (8 bytes) inside struct proxy
in order to store every proxy (except default ones) within a global
"proxies" list which is actually representative for all proxies existing
under haproxy process, like we already have for servers.
proxy_cond_disable() collects and prints cumulated connections for be and
fe proxies no matter their type. With shared stats it may cause issues
because depending on the proxy capabilities only fe or be counters may
be allocated.
In this patch we add some checks to ensure we only try to read from
valid memory locations, else we rely on default values (0).
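A minimal sketch of such checks, not the actual proxy_cond_disable() code.
The proxy_sketch type and the helpers are hypothetical, and the bit values
given to PR_CAP_FE and PR_CAP_BE are placeholders. Counters are only read when
the matching capability is set and the pointer is valid, otherwise 0 is used.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values, the real ones may differ. */
#define PR_CAP_FE 0x1U
#define PR_CAP_BE 0x2U

struct fe_cnt { uint64_t cum_conn; };
struct be_cnt { uint64_t cum_conn; };

/* Hypothetical proxy: counters may only be allocated for its capabilities. */
struct proxy_sketch {
	unsigned int cap;
	struct fe_cnt *fe; /* NULL when not allocated */
	struct be_cnt *be; /* NULL when not allocated */
};

/* Cumulated frontend connections, or 0 if there is no valid fe counter. */
uint64_t proxy_fe_conns(const struct proxy_sketch *px)
{
	if ((px->cap & PR_CAP_FE) && px->fe)
		return px->fe->cum_conn;
	return 0;
}

/* Cumulated backend connections, or 0 if there is no valid be counter. */
uint64_t proxy_be_conns(const struct proxy_sketch *px)
{
	if ((px->cap & PR_CAP_BE) && px->be)
		return px->be->cum_conn;
	return 0;
}
```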
init_srv_requeue() and init_srv_slowstart() functions are called after
initial server parsing via REGISTER_POST_SERVER_CHECK() hook, and they
are also manually called for dynamic server after the server is
initialized.
This may conflict with _srv_postparse() which is also registered via
REGISTER_POST_SERVER_CHECK() and called during dynamic server creation
To ensure functions don't conflict with each other, let's ensure they
are executed in proper order by calling init_srv_requeue and
init_srv_slowstart() from _srv_postparse() which now becomes the parent
function for server related postparsing stuff. No change of behavior is
expected.
Introduced in:
25bcdb1d9 BUG/MAJOR: h1: Be stricter on request target validation during message parsing
see also:
fbbbc33df REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+
In resolve_sym_name() we declare a few symbols that we want to be able
to resolve. ha_dump_backtrace() was declared with a struct buffer instead
of a pointer to such a struct, which has no effect since we only want to
get the function's pointer, but produces a build warning with LTO, so
let's fix it.
This can be backported to 3.0.
Released version 3.2.0 with the following main changes :
- MINOR: promex: Add agent check status/code/duration metrics
- MINOR: ssl: support strict-sni in ssl-default-bind-options
- MINOR: ssl: also provide the "tls-tickets" bind option
- MINOR: server: define CLI I/O handler for "add server"
- MINOR: server: implement "add server help"
- MINOR: server: use stress mode for "add server help"
- BUG/MEDIUM: server: fix crash after duplicate GUID insertion
- BUG/MEDIUM: server: fix potential null-deref after previous fix
- MINOR: config: list recently added sections with -dKcfg
- BUG/MAJOR: cache: Crash because of wrong cache entry deleted
- DOC: configuration: fix the example in crt-store
- DOC: config: clarify the wording around single/double quotes
- DOC: config: clarify the legacy cookie and header captures
- DOC: config: fix alphabetical ordering of layer 7 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 6 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 5 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 4 sample fetch functions
- DOC: config: fix alphabetical ordering of internal sample fetch functions
- BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request
- BUG/MEDIUM: h3: Declare absolute URI as normalized when a :authority is found
- DOC: config: mention in bytes_in and bytes_out that they're read on input
- DOC: config: clarify the basics of ACLs (call point, multi-valued etc)
- REGTESTS: Make the script testing conditional set-var compatible with Vtest2
- REGTESTS: Explicitly allow failing shell commands in some scripts
- MINOR: listeners: Add support for a label on bind line
- BUG/MEDIUM: cli/ring: Properly handle shutdown in "show event" I/O handler
- BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API
- BUG/MEDIUM: hlua: Fix getline() for TCP applets to work with applet's buffers
- BUG/MEDIUM: hlua: Fix receive API for TCP applets to properly handle shutdowns
- CI: vtest: Rely on VTest2 to run regression tests
- CI: vtest: Fix the build script to properly work on MaOS
- CI: combine AWS-LC and AWS-LC-FIPS by template
- BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused
- DOC: hlua: Add a note to warn user about httpclient object reuse
- DOC: hlua: fix a few typos in HTTPMessage.set_body_len() documentation
- DEV: patchbot: prepare for new version 3.3-dev
- MINOR: version: mention that it's 3.2 LTS now.
A few typos were noticed while gathering info for the 3.2 announce
messages, this fixes them, and will probably constitute the last
commit of this release. There's no need to backport it unless commit
94055a5e7 ("MEDIUM: hlua: Add function to change the body length of
an HTTP Message") is backported.
It is not supported to reuse an lua httpclient instance to process several
requests. A new object must be created for each request. Thanks to the
previous patch ("BUG/MEDIUM: httpclient: Throw an error if an lua httpclient
instance is reused"), an error is now reported if this happens. But it is
not obvious for users. So the lua-api documentation was updated accordingly.
This patch is related to issue #2986. It should be backported with the
commit above.
It is not expected/supported to reuse an httpclient instance to process
several requests. A new instance must be created for each request. However,
in lua, there is nothing to prevent a user from creating an httpclient object
and use it in a loop to process requests.
That's unfortunate because this will apparently work, the requests will be
sent and a response will be received and processed. However internally some
resources will be allocated and never released. When the next response is
processed, the resources allocated for the previous one are definitively
lost.
In this patch we take care to check that the httpclient object was never
used when a request is sent from a lua script by checking
HTTPCLIENT_FS_STARTED flag. This flag is set when an httpclient applet is
spawned to process a request and never removed after that. In lua, the
httpclient applet is created when the request is sent. So, it is the right
place to do this test.
This patch should fix the issue #2986. It should be backported as far as
2.6.
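The check can be pictured with the following sketch. It is written in plain C
and is not the Lua binding code: the httpclient_sketch type and
httpclient_send_once() are hypothetical, and the value given to
HTTPCLIENT_FS_STARTED is a placeholder. The flag is set on the first send and
never cleared, so any further send on the same instance is rejected.

```c
/* Placeholder bit value, the real flag value may differ. */
#define HTTPCLIENT_FS_STARTED 0x1U

/* Hypothetical minimal httpclient instance. */
struct httpclient_sketch {
	unsigned int flags;
};

/* Returns 0 if the request may be sent, -1 if the instance was already used. */
int httpclient_send_once(struct httpclient_sketch *hc)
{
	if (hc->flags & HTTPCLIENT_FS_STARTED)
		return -1; /* reuse detected: the caller must report an error */

	/* set when the applet is spawned, never removed afterwards */
	hc->flags |= HTTPCLIENT_FS_STARTED;
	return 0;
}
```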
VTest2 (https://github.com/vtest/VTest2) was released and is a replacement
for VTest. VTest was archived. So let's use the new version now.
If this commit is backported, the 2 following commits must also be
backported:
* 2808e3577 ("REGTESTS: Explicitly allow failing shell commands in some scripts")
* 82c291124 ("REGTESTS: Make the script testing conditional set-var compatible with Vtest2")
An optional timeout was added to AppletTCP.receive() to interrupt calls after a
delay. It was mandatory to be able to implement interactive applets (like
trisdemo). However, this broke the API and it made it impossible to differentiate
shutdowns from delay expirations. Indeed, in both cases, an empty
string was returned.
Because historically an empty string was used to notify a connection shutdown,
it should not be changed. So now, 'nil' value is returned when no data was
available before the delay expiration.
The new AppletTCP:try_receive() function was also affected. To fix it, instead
of stating there is no delay when a receive is tried, an expired delay is
set. Concretely TICK_ETERNITY was replaced by now_ms.
Finally, AppletTCP:getline() function is not concerned for now because there
is no way to interrupt it after some delay.
The documentation and trisdemo lua script were updated accordingly.
This patch depends on "BUG/MEDIUM: hlua: Properly detect shudowns for TCP
applets based on the new API". However, it is a 3.2-specific issue, so no
backport is needed.
The commit e5e36ce09 ("BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work
with applet's buffers") fixed the TCP applets API to work with applets using
their own buffers. However, the getline() function was not updated. It could be
an issue for anyone registering CLI commands reading lines.
This patch should be backported as far as 3.0.
The internal function responsible to receive data for TCP applets with
internal buffers is buggy. Indeed, for these applets, the buffer API is used
to get data. So there are no tests on the SE to properly detect connection
shutdowns. So, it must be performed by hand after the call to b_getblk_nc().
This patch must be backported as far as 3.0.
The commit 03dc54d802 ("BUG/MINOR: ring: Fix I/O handler of "show event"
command to not rely on the SC") introduced a regression. By removing
dependencies on the SC, a test to detect client shutdowns was removed. So
now, the CLI applet is no longer released when the client shuts the
connection during a "show event -w".
So of course, we should not use the SC to detect the shutdowns. But the SE
must be used instead.
It is a 3.2-specific issue, so no backport needed.
It is now possible to set a label on a bind line. All sockets attached to
this bind line inherit this label. The idea is to be able to group
sockets. For now, there is no mechanism to create these groups, this must be
done by hand.
VTest2, which should replace VTest in a few months, will reject any failing
commands in shell blocks. However, some scripts are executing some commands,
expecting an error to be able to parse the error output. So, now use "set
+e" in those scripts to explicitly state that failing commands are expected.
It is just used for non-final commands. At the end, the shell block must
still report a success.
VTest2 will replace VTest in a few months. Not many changes are
expected. One of them is that a User-Agent header is added by default in all
requests, except if a custom one is already set or if the "-nouseragent" option
is used. To remain compatible with VTest, it is not possible to use the
option to avoid the header addition. So, a custom user-agent is added in the
last test of "sample_fetches/cond_set_var.vtc" to be sure it will pass with
VTest and VTest2. It is mandatory because the request length is tested.
This is essentially in order to address the concerns expressed in
issue #2226 where it is mentioned that the moment they are called is
not clear enough. Admittedly, re-reading the paragraph doesn't make
it obvious on a quick read that they behave like functions. This patch
adds an extra paragraph that makes the parallel with programming
languages' boolean functions and explains the fact that they can be
multi-valued. Hoping this is clearer now.
Issue #2267 suggests that it's unclear what exactly the byte counts mean
(particularly when compression is involved). Let's clarify that the counts
are read on data input and that they also cover headers and a bit of
internal overhead.
Since commit 2c3d656f8 ("MEDIUM: h3: use absolute URI form with
:authority"), the absolute URI form is used when a ':authority'
pseudo-header is found. However, this URI was not declared as normalized
internally. So, when the request is reformatted to be sent to an h1 server,
the absolute-form is used instead of the origin-form. It is unexpected and
may be an issue for some servers that could reject the request.
So, now, we take care to set HTX_SL_F_HAS_AUTHORITY flag on the HTX message
when an authority was found and HTX_SL_F_NORMALIZED_URI flag is set for
"http" or "https" schemes.
No backport needed because the commit above must not be backported. It
should fix a regression reported on the 3.2-dev17 in issue #2977.
This commit depends on "BUG/MINOR: h3: Set HTX flags corresponding to the
scheme found in the request".
When a ":scheme" pseudo-header is found in a h3 request, the
HTX_SL_F_HAS_SCHM flag must be set on the HTX message. And if the scheme is
'http' or 'https', the corresponding HTX flag must also be set. So,
respectively, HTX_SL_F_SCHM_HTTP or HTX_SL_F_SCHM_HTTPS.
It is mainly used to send the right ":scheme" pseudo-header value to H2
server on backend side.
This patch could be backported as far as 2.6.
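The flag selection described above can be sketched as follows. This is only an illustration: the EX_* names and their values are hypothetical stand-ins for the HTX_SL_F_* flags, and the case-insensitive comparison is an assumption.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

/* Hypothetical stand-ins for the HTX start-line flags named above;
 * the numeric values are illustrative only.
 */
#define EX_SL_F_HAS_SCHM    0x1u
#define EX_SL_F_SCHM_HTTP   0x2u
#define EX_SL_F_SCHM_HTTPS  0x4u

/* Returns the flags to set for a given ":scheme" value, or 0 if the
 * pseudo-header is absent (scheme == NULL).
 */
static unsigned int ex_scheme_to_flags(const char *scheme)
{
	unsigned int flags;

	if (scheme == NULL)
		return 0;

	flags = EX_SL_F_HAS_SCHM;
	if (strcasecmp(scheme, "http") == 0)
		flags |= EX_SL_F_SCHM_HTTP;
	else if (strcasecmp(scheme, "https") == 0)
		flags |= EX_SL_F_SCHM_HTTPS;
	return flags;
}
```

Any other scheme only gets the "has scheme" flag, so the original value can still be forwarded as-is.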
As reported in issue #2195, cookie captures and header captures are no
longer the recommended way to proceed. Let's mention that this is the
legacy way and provide a few pointers to the recommended functions and
actions to use the modern methods.
As reported in issue #2327, the wording used in the section about quoting
can be read two ways due to the use of the two types of quotes to protect
each other quote. Better only use the quoting without mixing the two when
mentioning them.
Fix a bad example in the crt-store section. site1 does not use the "web"
crt-store but the global one.
Must be backported as far as 3.0 however the section was 3.12 in
previous version.
When "vary" is enabled, we can have multiple entries for a given primary
key in the cache tree. There is a limit to how many secondary entries
can be inserted for a given key. When we try to insert a new secondary
entry, if the limit is already reached, we can try to find expired
entries with the same primary key, and if the limit is still reached we
want to abort the current insertion and to remove the node that was just
inserted.
In commit "a29b073: MEDIUM: cache: Add refcount on cache_entry" though,
a regression was introduced. Instead of removing the entry just inserted
as the comments suggested, we removed the second to last entry and
returned NULL. We then reset the eb.key of the cache_entry in the caller
because we assumed that the entry was already removed from the tree.
This means that some entries with an empty key were wrongly kept in the
tree and the last secondary entry, which keeps the number of secondary
entries of a given key, was removed.
This ended up causing some crashes later on when we tried to iterate
over the elements of this given key. The crash could occur in multiple
places, either when trying to retrieve an entry or to add some new ones.
This crash was raised in GitHub issue #2950.
The fix should be backported up to 3.0.
A valid build warning was reported in the CI with latest commit b40ce97ecc
("BUG/MEDIUM: server: fix crash after duplicate GUID insertion"). Indeed,
if the first test in the function fails, we branch to the err label
with guid==NULL and will crash there. Let's just test guid before
dereferencing it for freeing.
This needs to be backported to 3.0 as well since the commit above was
meant to go there.
On "add server", if a GUID is defined, guid_insert() is used to add the
entry into the global GUID tree. If a similar entry already exists, GUID
insertion fails and the server creation is eventually aborted.
A crash could occur in this case because of an invalid memory access via
guid_remove(). The latter is called via free_server() as the server
insertion is rejected. The invalid access occurs on the GUID key.
The issue occurs because of guid_insert(). The function properly
deallocates the GUID key on duplicate insertion, but it failed to reset
<guid.node.key> to NULL. This caused the invalid memory access on
guid_remove(). To fix this, ensure that the key member is properly reset
on guid_insert() error path.
This must be backported up to 3.0.
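The pattern behind this fix can be sketched as follows. It is a simplified model, not the real code: struct ex_guid, the single-slot "tree" and both functions are hypothetical stand-ins for the GUID tree, guid_insert() and guid_remove().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an object holding a GUID key. */
struct ex_guid {
	char *key;   /* NULL when not inserted */
};

/* Single-slot "tree", enough to provoke a duplicate in the example. */
static const char *ex_tree_key;

/* Returns 0 on success, -1 on duplicate or allocation failure. On
 * error, the key is released AND reset so that later cleanup is safe.
 */
static int ex_guid_insert(struct ex_guid *g, const char *uid)
{
	size_t len = strlen(uid) + 1;

	g->key = malloc(len);
	if (!g->key)
		return -1;
	memcpy(g->key, uid, len);

	if (ex_tree_key && strcmp(ex_tree_key, g->key) == 0) {
		free(g->key);
		g->key = NULL;   /* the missing step: leave no dangling pointer */
		return -1;
	}
	ex_tree_key = g->key;
	return 0;
}

/* Safe to call on an object whose insertion failed. */
static void ex_guid_remove(struct ex_guid *g)
{
	if (!g->key)
		return;
	if (ex_tree_key == g->key)
		ex_tree_key = NULL;
	free(g->key);
	g->key = NULL;
}
```

Without the reset in the error path, the later remove call would compare and free a pointer that was already released.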
Implement stress mode on "add server help". This ensures that the
command is fully reentrant on full output buffer.
For testing, it requires compilation with USE_STRESS and global setting
"stress-level 1".
Implement "help" as a sub-command for "add server" CLI. The objective is
to list all the keywords that are supported for dynamic servers. CLI IO
handler and add_srv_ctx are used to support reentrancy on full output
buffer.
Now that this command is implemented, the outdated keyword list on "add
server" from management documentation can be removed.
Extend "add server" to support an IO handler function named
cli_io_handler_add_server(). A context object is also defined whose
usage will depend on IO handler capabilities.
IO handler is skipped when "add server" is run in default mode, i.e. on
a dynamic server creation. Thus, currently IO handler is unneeded.
However, it will become useful to support sub-commands for "add server".
Note that return value of "add server" parser has been changed on server
creation success. Previously, it was used incorrectly to report if
server was inserted or not. In fact, parser return value is used by CLI
generic code to detect if command processing has been completed, or
should continue to the IO handler. Now, "add server" always returns 1 to
signal that CLI processing is completed. This is necessary to preserve
CLI output emitted by parser, even now that IO handler is defined for
the command. Previously, output was emitted in every situations due to
IO handler not defined. See below code snippet from cli.c for a better
overview :
    if (kw->parse && kw->parse(args, payload, appctx, kw->private) != 0) {
            ret = 1;
            goto fail;
    }
    /* kw->parse could set its own io_handler or io_release handler */
    if (!appctx->cli_ctx.io_handler) {
            ret = 1;
            goto fail;
    }
    appctx->st0 = CLI_ST_CALLBACK;
    ret = 1;
    goto end;
Currently there is "no-tls-tickets" that is also supported in the
ssl-default-bind-options directive, but there's no way to re-enable
them on a specific "bind" line. This patch simply provides the option
to re-enable them. Note that the flag is inverted because tickets are
enabled by default and the no-tls-tickets option sets the flag to
disable them.
Several users already reported that it would be nice to support
strict-sni in ssl-default-bind-options. However, in order to support
it, we also need an option to disable it.
This patch moves the setting of the option from the strict_sni field
to a flag in the ssl_options field so that it can be inherited from
the default bind options, and adds a new "no-strict-sni" directive to
allow to disable it on a specific "bind" line.
The test file "del_ssl_crt-list.vtc" which already tests both options
was updated to make use of the default option and the no- variant to
confirm everything continues to work.
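The inheritance scheme used by these two patches can be illustrated with a small sketch. All names and values here are hypothetical, and the keyword that re-enables tickets is assumed to be spelled "tls-tickets" since it is not named above.

```c
#include <string.h>

/* Hypothetical option bits; names and values are illustrative. */
#define EX_OPT_NO_TLS_TICKETS  0x1u
#define EX_OPT_STRICT_SNI      0x2u

struct ex_bind {
	unsigned int ssl_options;
};

/* A new "bind" line starts from the default bind options... */
static void ex_bind_init(struct ex_bind *b, unsigned int default_options)
{
	b->ssl_options = default_options;
}

/* ...then per-line keywords set or clear individual bits. */
static void ex_bind_apply(struct ex_bind *b, const char *kw)
{
	if (strcmp(kw, "strict-sni") == 0)
		b->ssl_options |= EX_OPT_STRICT_SNI;
	else if (strcmp(kw, "no-strict-sni") == 0)
		b->ssl_options &= ~EX_OPT_STRICT_SNI;
	else if (strcmp(kw, "no-tls-tickets") == 0)
		b->ssl_options |= EX_OPT_NO_TLS_TICKETS;    /* inverted flag */
	else if (strcmp(kw, "tls-tickets") == 0)
		b->ssl_options &= ~EX_OPT_NO_TLS_TICKETS;
}
```

Storing the setting as a bit in a shared options field is what makes it inheritable: a dedicated per-line field would have no default to copy from.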
In the Prometheus exporter, the last health check status is already exposed,
with its code and duration in seconds. The server status is also exposed.
But the information about the agent check is not available. It is not
really handy because when a server status is changed because of the agent,
it is not obvious by looking at the Prometheus metrics. Indeed, the server
may be reported as DOWN for instance, while the health check status still
reports a success. Being able to get the agent status in that case could be
valuable.
So now, the last agent check status is exposed, with its code and duration
in seconds. The following metrics can be grabbed now:
* haproxy_server_agent_status
* haproxy_server_agent_code
* haproxy_server_agent_duration_seconds
Note that unlike the other metrics, no per-backend aggregated metric is
exposed.
This patch is related to issue #2983.
Released version 3.2-dev17 with the following main changes :
- DOC: configuration: explicit multi-choice on bind shards option
- BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers
- BUG/MEDIUM: peers: also limit the number of incoming updates
- MEDIUM: hlua: Add function to change the body length of an HTTP Message
- BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload
- BUG/MINOR: h3: don't insert more than one Host header
- BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field
- DOC: config: properly index "table and "stick-table" in their section
- DOC: management: change reference to configuration manual
- BUILD: debug: mark ha_crash_now() as attribute(noreturn)
- IMPORT: slz: avoid multiple shifts on 64-bits
- IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested
- IMPORT: slz: use a better hash for machines with a fast multiply
- IMPORT: slz: fix header used for empty zlib message
- IMPORT: slz: silence a build warning on non-x86 non-arm
- BUG/MAJOR: leastconn: do not loop forever when facing saturated servers
- BUG/MAJOR: queue: properly keep count of the queue length
- BUG/MINOR: quic: fix crash on quic_conn alloc failure
- BUG/MAJOR: leastconn: never reuse the node after dropping the lock
- MINOR: acme: renewal notification over the dpapi sink
- CLEANUP: quic: Useless BIO_METHOD initialization
- MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures
- MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed)
- MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API
- MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset
- MINOR: quic: OpenSSL 3.5 trick to support 0-RTT
- DOC: update INSTALL for QUIC with OpenSSL 3.5 usages
- DOC: management: update 'acme status'
- BUG/MEDIUM: wdt: always ignore the first watchdog wakeup
- CLEANUP: wdt: clarify the comments on the common exit path
- BUILD: ssl: avoid possible printf format warning in traces
- BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t
- DOC: management: precise some of the fields of "show servers conn"
- BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error
- DOC: watchdog: update the doc to reflect the recent changes
- BUG/MEDIUM: acme: check if acme domains are configured
- BUG/MINOR: acme: fix formatting issue in error and logs
- EXAMPLES: lua: avoid screen refresh effect in "trisdemo"
- CLEANUP: quic: remove unused cbuf module
- MINOR: quic: move function to check stream type in utils
- MINOR: quic: refactor handling of streams after MUX release
- MINOR: quic: add some missing includes
- MINOR: quic: adjust quic_conn-t.h include list
- CLEANUP: cfgparse: alphabetically sort the global keywords
- MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage"
It was mentioned during the development of glitches that it would be
nice to support not killing misbehaving connections below a certain
CPU usage so that poor implementations that routinely misbehave without
impact are not killed. This is now possible by setting a CPU usage
threshold under which we don't kill them via this parameter. It defaults
to zero so that we continue to kill them by default.
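As a rough illustration of the decision this setting adds, assuming a hypothetical helper and hypothetical parameter names (the actual comparison operators and units used by the implementation are not stated here):

```c
/* Hypothetical decision helper; all names are illustrative.
 * <cpu_usage> and <min_cpu_usage> are percentages (0..100).
 * Returns 1 if the connection should be killed, otherwise 0.
 */
static int ex_should_kill(unsigned int glitches, unsigned int glitch_threshold,
                          unsigned int cpu_usage, unsigned int min_cpu_usage)
{
	if (!glitch_threshold || glitches < glitch_threshold)
		return 0;   /* check disabled or not misbehaving enough */

	/* with min_cpu_usage == 0 (the default) this is always true */
	return cpu_usage >= min_cpu_usage;
}
```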
Adjust include list in quic_conn-t.h. This file is included in many QUIC
source files, so it is useful to keep it as lightweight as possible. Note that
connection/QUIC MUX are transformed into forward declarations for better
layer separation.
Insert some missing include statements in QUIC source files. This was
detected after the next commit which adjusts the include list used in
quic_conn-t.h file.
quic-conn layer has to handle itself STREAM frames after MUX release. If
the stream was already seen, it is probably only a retransmitted frame
which can be safely ignored. For other streams, an active closure may be
needed.
Thus it's necessary that quic-conn layer knows the highest stream ID
already handled by the MUX after its release. Previously, this was done
via <nb_streams> member array in quic-conn structure.
Refactor this by replacing <nb_streams> by two members called
<stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for
quic-conn layer to monitor locally opened uni streams, as the peer
cannot by definition emit a STREAM frame on it. Also, bidirectional
streams are always opened by the remote side.
Previously, <nb_streams> were set by quic-stream layer. Now,
<stream_max_uni>/<stream_max_bidi> members are only set one time, just
prior to QUIC MUX release. This is sufficient as quic-conn does not use
them if the MUX is available.
Note that previously, IDs were used relatively to their type, thus
incremented by 1, after shifting the original value. For simplification,
use the plain stream ID, which is incremented by 4.
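For reference, the relation between a plain stream ID and its per-type index can be sketched as below. The bit layout is the one defined by RFC 9000 (bit 0 is the initiator, bit 1 the direction); the EX_* and ex_* names are hypothetical.

```c
#include <stdint.h>

#define EX_QUIC_ID_SERVER  0x1u   /* bit 0: set if server-initiated */
#define EX_QUIC_ID_UNI     0x2u   /* bit 1: set if unidirectional */

static int ex_stream_is_uni(uint64_t id)
{
	return !!(id & EX_QUIC_ID_UNI);
}

static int ex_stream_is_bidi(uint64_t id)
{
	return !ex_stream_is_uni(id);
}

/* "Relative" index of a stream within its type: 0, 1, 2, ... */
static uint64_t ex_stream_index(uint64_t id)
{
	return id >> 2;
}

/* Next plain ID of the same type: 0 -> 4 -> 8 ... */
static uint64_t ex_stream_next(uint64_t id)
{
	return id + 4;
}
```

Keeping the plain ID avoids the shift on every comparison, at the cost of stepping by 4 instead of 1.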
Move general function to check if a stream is uni or bidirectional from
QUIC MUX to quic_utils module. This should prevent unnecessary include
of QUIC MUX header file in other sources.
In current version of the game, there is a "screen refresh" effect: the
screen is cleared before being re-drawn.
I moved the clear right after the connection is opened and removed it
from rendering time.
Stop emitting \n in errmsg for intermediate error messages, this was
emitting multiline logs and was returning to a new line in the middle of
sentences.
We don't need to emit them in acme_start_task() since the errmsg is
output in a send_log which already contains a \n or on the CLI which
also emits it.
When starting the ACME task with a ckch_conf which does not contain the
domains, the ACME task would segfault because it will try to dereference
a NULL in this case.
The patch fixes the issue by emitting a warning when no domains are
configured. It's not done at configuration parsing because it is not
easy to emit the warning there: there is no callback system which
gives access to the whole ckch_conf once a line is parsed.
No backport needed.
RX buffer allocation has been reworked in current dev tree. The
objective is to support multiple buffers per QCS to improve upload
throughput.
RX buffer allocation failure is handled simply : the whole connection is
closed. This is done via qcc_set_error(), with INTERNAL_ERROR as error
code. This function contains a BUG_ON() to ensure it is called only one
time per connection instance.
On RX buffer alloc failure, the aforementioned BUG_ON() crashes due to a
double invocation of qcc_set_error(). First by qcs_get_rxbuf(), and
immediately after it by qcc_recv(), which is the caller of the previous
one. This regression was introduced by the following commit.
60f64449fb
MAJOR: mux-quic: support multiple QCS RX buffers
To fix this, simply remove qcc_set_error() invocation in
qcs_get_rxbuf(). On buffer alloc failure, qcc_recv() is responsible for
setting the error.
This does not need to be backported.
As reported in issue #2970, the output of "show servers conn" is not
clear. It was essentially meant as a debugging tool during some changes
to idle connections management, but if some users want to monitor or
graph them, more info is needed. The doc mentions the currently known
list of fields, and reminds that this output is not meant to be stable
over time, but as long as it does not change, it can provide some useful
metrics to some users.
The build failed on mips32 with a 64-bit time_t here:
https://github.com/haproxy/haproxy/actions/runs/15150389164/job/42595310111
Let's just turn the "remain" variable used to show the remaining time
into a more portable ullong and use %llu for all format specifiers,
since long remains limited to 32-bit on 32-bit archs.
No backport needed.
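A minimal sketch of the portable formatting approach, with a hypothetical helper name and message format:

```c
#include <stdio.h>
#include <time.h>

/* Formats a remaining time in seconds into <buf>. The value is stored
 * in an unsigned long long so that %llu matches on every architecture,
 * whatever the width of time_t or long.
 */
static int ex_format_remaining(char *buf, size_t size, time_t expire, time_t now)
{
	unsigned long long remain = 0;

	if (expire > now)
		remain = (unsigned long long)(expire - now);
	return snprintf(buf, size, "remaining=%llus", remain);
}
```

The same idea applies to any type whose width varies between platforms: convert to a fixed, wide enough type at the call site rather than guessing the format specifier.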
When building on MIPS-32 with gcc-9.5 and glibc-2.31, I got this:
src/ssl_trace.c: In function 'ssl_trace':
src/ssl_trace.c:118:42: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'ssize_t' {aka 'const int'} [-Wformat=]
118 | chunk_appendf(&trace_buf, " : size=%ld", *size);
    |                                    ~~^   ~~~~~
    |                                      |   |
    |                                      |   ssize_t {aka const int}
    |                                      long int
    |                                    %d
Let's just cast the type. No backport needed.
With commit a06c215f08 ("MEDIUM: wdt: always make the faulty thread
report its own warnings"), when the TH_FL_STUCK flag was flipped on,
we'd then go to the panic code instead of giving a second chance like
before the commit. This can trigger rare cases that only happen with
moderate loads like was addressed by commit 24ce001771 ("BUG/MEDIUM:
wdt: fix the stuck detection for warnings"). This is in fact due to
the loss of the common "goto update_and_leave" that used to serve
both the warning code and the flag setting for probation, and it's
apparently what hit Christian in issue #2980.
Let's make sure we exit naturally when turning the bit on for the
first time. Let's also update the confusing comment at the end of
the check that was left over by latest change.
Since the first commit was backported to 3.1, this commit should be
backported there as well.
For an unidentified reason, SSL_do_handshake() succeeds at its first call when 0-RTT
is enabled for the connection. This behavior looks very similar to the one encountered
with the AWS-LC stack. That said, it was documented by AWS-LC. This issue leads the
connection to stop sending handshake packets after having released the handshake
encryption level. In fact, no handshake packets could even be sent, leading
the handshake to always fail.
To fix this, this patch simulates a "handshake in progress" state waiting
for the application level read secret to be established by the TLS stack.
This may happen only after the QUIC listener has completed/confirmed the handshake
upon handshake CRYPTO data receipt from the peer.
A QUIC endpoint must send its transport parameters using a TLS custom extension. This
extension is reset by SSL_set_SSL_CTX(). It can be restored by calling
quic_ssl_set_tls_cbs() (which calls SSL_set_quic_tls_cbs()).
The quic_conn struct is modified for two reasons. The first one is to store
the encoded version of the local transport parameters as this is done for
USE_QUIC_OPENSSL_COMPAT. Indeed, the local transport parameters "should remain
valid until after the parameters have been sent" as mentioned by the
SSL_set_quic_tls_cbs(3) manual. In our case, the buffer is a static buffer
attached to the quic_conn object. The role of qc_ssl_set_quic_transport_params()
is to call SSL_set_quic_tls_transport_params() (aliased by
SSL_set_quic_transport_params()) to set these local transport parameters into
the TLS stack from the buffer attached to the quic_conn struct.
The second quic_conn struct modification is the addition of the new ->prot_level
(SSL protection level) member to store "the most
recent write encryption level set via the OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn
callback (if it has been called)" as mentioned by the SSL_set_quic_tls_cbs(3) manual.
This patch finally implements the five remaining callbacks to make the haproxy
QUIC implementation work.
OSSL_FUNC_SSL_QUIC_TLS_crypto_send_fn() (ha_quic_ossl_crypto_send) is easy to
implement. It calls ha_quic_add_handshake_data() after having converted
qc->prot_level TLS protection level value to the correct ssl_encryption_level_t
(boringSSL API/quictls) value.
OSSL_FUNC_SSL_QUIC_TLS_crypto_recv_rcd_fn() (ha_quic_ossl_crypto_recv_rcd())
provides the non-contiguous addresses to the TLS stack, without releasing
them.
OSSL_FUNC_SSL_QUIC_TLS_crypto_release_rcd_fn() (ha_quic_ossl_crypto_release_rcd())
releases these non-contiguous buffers, relying on the fact that the list of
encryption levels (qc->qel_list) is correctly ordered by SSL protection level
secret establishment order (by the TLS stack).
OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn() (ha_quic_ossl_yield_secret())
is a simple wrapping function over ha_quic_set_encryption_secrets() which is used
by boringSSL/quictls API.
The role of OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() (ha_quic_ossl_got_transport_params())
is to store the peer received transport parameters. It simply calls
quic_transport_params_store() and sets them into the TLS stack by calling
qc_ssl_set_quic_transport_params().
Also add some comments for all the OpenSSL 3.5 QUIC API callbacks.
This patch has no impact on the other uses of the QUIC API provided by the other TLS
stacks.
This patch allows the use of the new OpenSSL 3.5.0 QUIC TLS API when it is
available and detected at compilation time. The detection relies on the presence of the
OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_SEND macro from openssl-compat.h. Indeed this
macro is defined by OpenSSL since 3.5.0 version. It is not defined by quictls.
This helps in distinguishing these two TLS stacks. When the detection succeeds,
HAVE_OPENSSL_QUIC is also defined by openssl-compat.h. It is then this new macro
which is used to detect the availability of the new OpenSSL 3.5.0 QUIC TLS API.
Note that this detection is done only if USE_QUIC_OPENSSL_COMPAT is not requested.
So, USE_QUIC_OPENSSL_COMPAT and HAVE_OPENSSL_QUIC are exclusive.
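The detection rule described above boils down to a few preprocessor tests. The sketch below only models that rule; it is not a copy of openssl-compat.h, and ex_quic_tls_api() with its return strings is a hypothetical helper added to make the outcome observable.

```c
/* In a real build the two input macros come from the build options
 * (USE_QUIC_OPENSSL_COMPAT) and from the OpenSSL headers
 * (OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_SEND).
 */
#if !defined(USE_QUIC_OPENSSL_COMPAT) && defined(OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_SEND)
#define HAVE_OPENSSL_QUIC
#endif

#if defined(USE_QUIC_OPENSSL_COMPAT) && defined(HAVE_OPENSSL_QUIC)
#error "USE_QUIC_OPENSSL_COMPAT and HAVE_OPENSSL_QUIC are exclusive"
#endif

/* Hypothetical helper reporting which API was selected at build time. */
static const char *ex_quic_tls_api(void)
{
#if defined(HAVE_OPENSSL_QUIC)
	return "openssl-3.5-quic";
#elif defined(USE_QUIC_OPENSSL_COMPAT)
	return "openssl-compat";
#else
	return "quictls-or-other";
#endif
}
```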
At the same location, from openssl-compat.h, ssl_encryption_level_t enum is
defined. This enum was defined by quictls and extensively used by the haproxy
QUIC implementation. SSL_set_quic_transport_params() is replaced by
SSL_set_quic_tls_transport_params(). SSL_set_quic_early_data_enabled() (quictls) is also replaced
by SSL_set_quic_tls_early_data_enabled() (OpenSSL). SSL_quic_read_level() (quictls)
is not defined by OpenSSL. It is only used by the traces to log the current
TLS stack decryption level (read). A macro makes it return -1 which is an
unused value.
Most of the differences between the quictls and OpenSSL QUIC APIs are in quic_ssl.c
where some callbacks must be defined for these two APIs. This is why this
patch modifies quic_ssl.c to define an array of OSSL_DISPATCH structs: <ha_quic_dispatch>.
Each element of this array defines a callback. So, this patch implements these
six callbacks:
- ha_quic_ossl_crypto_send()
- ha_quic_ossl_crypto_recv_rcd()
- ha_quic_ossl_crypto_release_rcd()
- ha_quic_ossl_yield_secret()
- ha_quic_ossl_got_transport_params() and
- ha_quic_ossl_alert().
But at this time, these implementations, which must return an int, return 0, interpreted
as a failure by the OpenSSL QUIC API, except for ha_quic_ossl_alert() which
is implemented the same way as for quictls. The five remaining functions above
will be implemented by the next patches to come.
ha_quic_set_encryption_secrets() and ha_quic_add_handshake_data() have been moved
to be defined for both quictls and OpenSSL QUIC API.
These callbacks are attached to the SSL objects (sessions) by calling the new
qc_ssl_set_cbs() function. The latter calls the correct function to attach the correct
callbacks to the SSL objects (defined by <ha_quic_method> for quictls, and
<ha_quic_dispatch> for OpenSSL).
The calls to SSL_provide_quic_data() and SSL_process_quic_post_handshake()
have also been disabled. These functions are not defined by OpenSSL QUIC API.
At this time, the functions which call them are still defined when HAVE_OPENSSL_QUIC
is defined.
There were no traces to diagnose qc_ssl_sess_init() failures from QUIC traces.
This patch adds calls to TRACE_DEVEL() into qc_ssl_sess_init() and its caller
(qc_alloc_ssl_sock_ctx()). This was useful at least to diagnose SSL context
initialization failures when porting QUIC to the new OpenSSL 3.5 QUIC API.
Should be easily backported as far as 2.6.
This code has been there since the start of the QUIC implementation. It was
supposed to initialize <ha_quic_meth> as a BIO_METHOD static object. But this
BIO_METHOD is not used at all!
Should be backported as far as 2.6 to help integrate the next patches to come.
Output a sink message when the certificate was renewed by the ACME
client.
The message is emitted on the "dpapi" sink, and ends with \n\0.
Since the message contains this binary character, the right -0 parameter
must be used when consulting the sink over the CLI:
Example:
$ echo "show events dpapi -nw -0" | socat -t9999 /tmp/haproxy.sock -
<0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0
When used with the master CLI, @@1 should be used instead of @1 in order
to keep the connection to the worker.
Example:
$ echo "@@1 show events dpapi -nw -0" | socat -t9999 /tmp/master.sock -
<0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0
On ARM with 80 cores and a single server, it's sometimes possible to see
a segfault in fwlc_get_next_server() around 600-700k RPS. It seldom
happens as well on x86 with 128 threads with the same config around 1M
rps. It turns out that in fwlc_get_next_server(), before calling
fwlc_srv_reposition(), we have to drop the lock and that one takes it
back again.
The problem is that anything can happen to our node during this time,
and it can be freed. Then when continuing our work, we later iterate
over it and its next to find a node with an acceptable key, and by
doing so we can visit either uninitialized memory or simply nodes that
are no longer in the tree.
A first attempt at fixing this consisted in artificially incrementing
the elements count before dropping the lock, but that turned out to be
even worse because other threads could loop forever on such an element
looking for an entry that does not exist. Maintaining a separate
refcount didn't work well either, and it required to deal with the
memory release while dropping it, which is really not convenient.
Here we're taking a different approach consisting in simply not
trusting this node anymore and going back to the beginning of the
loop, as is done at a few other places as well. This way we can
safely ignore the possibly released node, and the test runs reliably
both on the arm and the x86 platforms mentioned above. No performance
regression was observed either, likely because this operation is quite
rare.
No backport is needed since this appeared with the leastconn rework
in 3.2.
If there is an alloc failure during qc_new_conn(), cleaning is done via
quic_conn_release(). However, since the below commit, an unchecked
dereferencing of <qc.path> is performed in the latter.
e841164a44
MINOR: quic: account for global congestion window
To fix this, simply check <qc.path> before dereferencing it in
quic_conn_release(). This is safe as it is properly initialized to NULL
on qc_new_conn() first stage.
This does not need to be backported.
The queue length was moved to its own variable in commit 583303c48
("MINOR: proxies/servers: Calculate queueslength and use it."), however a
few places were missed in pendconn_unlink() and assign_server_and_queue()
resulting in never decreasing counts on aborted streams. This was
reproduced when injecting more connections than the total backend
could stand in TCP mode and letting some of them time out in the
queue. No backport is needed, this is only 3.2.
Since commit 9fe72bba3 ("MAJOR: leastconn; Revamp the way servers are
ordered."), there's no way to escape the loop visiting the mt_list heads
in fwlc_get_next_server if all servers in the list are saturated,
resulting in a watchdog panic. It can be reproduced with this config
and injecting with more than 2 concurrent conns:
balance leastconn
server s1 127.0.0.1:8000 maxconn 1
server s2 127.0.0.1:8000 maxconn 1
Here we count the number of saturated servers that were encountered, and
escape the loop once the number of remaining servers exceeds the number
of saturated ones. No backport is needed since this arrived in 3.2.
Building with clang 16 on MIPS64 yields this warning:
src/slz.c:931:24: warning: unused function 'crc32_uint32' [-Wunused-function]
static inline uint32_t crc32_uint32(uint32_t data)
^
Let's guard it using UNALIGNED_LE_OK which is the only case where it's
used. This saves us from introducing a possibly non-portable attribute.
This is libslz upstream commit f5727531dba8906842cb91a75c1ffa85685a6421.
Calling slz_rfc1950_finish() without emitting any data would result in
incorrectly emitting a gzip header (rfc1952) instead of a zlib header
(rfc1950) due to a copy-paste between the two wrappers. The impact is
almost nonexistent since the zlib format is almost never used in this
context, and compressing totally empty messages is quite rare as well.
Let's take this opportunity for fixing another mistake on an RFC number
in a comment.
This is slz upstream commit 7f3fce4f33e8c2f5e1051a32a6bca58e32d4f818.
The current hash involves 3 simple shifts and additions so that it can
be mapped to a multiply on architectures having a fast multiply. This is
indeed what the compiler does on x86_64. A large range of values was
scanned to try to find more optimal factors on machines supporting such
a fast multiply, and it turned out that new factor 0x1af42f resulted in
smoother hashes that provided on average 0.4% better compression on both
the Silesia corpus and an mbox file composed of very compressible emails
and uncompressible attachments. It's even slightly better than CRC32C
while being faster on Skylake. This patch enables this factor on archs
with a fast multiply.
This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.
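As an illustration only, a multiplicative hash built on the factor quoted above could look like this. The table size (EX_HASH_BITS) and the final shift are assumptions made for the sketch and are not taken from the slz sources.

```c
#include <stdint.h>

#define EX_HASH_BITS 13   /* hypothetical table size: 2^13 entries */

/* Multiplies by the factor mentioned above and keeps the top bits.
 * The unsigned multiply wraps modulo 2^32, which is well defined in C.
 */
static inline uint32_t ex_hash(uint32_t a)
{
	return (a * 0x1af42fU) >> (32 - EX_HASH_BITS);
}
```

On machines without a fast multiplier, the same kind of hash is typically expressed with a few shifts and additions instead, which is the trade-off the commit message refers to.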
If building for sse4 and USE_CRC32C_HASH is defined, then we can use
crc32c to calculate the lookup hash. By default we don't do it because
even on skylake it's slower than the current hash, which only involves
a short multiply (~5% slower). But the gains are marginal (0.3%).
This is slz upstream commit 44ae4f3f85eb275adba5844d067d281e727d8850.
Note: this is not used by default and only merged in order to avoid
divergence between the code bases.
On 64-bit platforms, disassembling the code shows that send_huff() performs
a left shift followed by a right one, which are the result of integer
truncation and zero-extension caused solely by using different types at
different levels in the call chain. By making encode24() take a 64-bit
int on input and send_huff() take one optionally, we can remove one shift
in the hot path and gain 1% performance without affecting other platforms.
This is slz upstream commit fd165b36c4621579c5305cf3bb3a7f5410d3720b.
Building on MIPS64 with clang16 incorrectly reports some uninitialized
value warnings in stats-proxy.c due to some calls to ABORT_NOW() where
the compiler didn't know the code wouldn't return. Let's properly mark
the function as noreturn, and take this opportunity for also marking it
unused to avoid possible warnings depending on the build options (if
ABORT_NOW is not used). No backport needed though it will not harm.
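The effect of the attribute can be shown with a small sketch. The function names are hypothetical; the attributes are the GCC/Clang ones.

```c
#include <stdlib.h>

/* Hypothetical crash helper. "noreturn" tells the compiler that control
 * never comes back; "unused" silences warnings when no caller exists.
 */
static inline __attribute__((noreturn, unused)) void ex_crash_now(void)
{
	abort();
}

static int ex_field_width(int type)
{
	int width;   /* intentionally not initialized */

	switch (type) {
	case 0:
		width = 4;
		break;
	case 1:
		width = 8;
		break;
	default:
		ex_crash_now();   /* never returns, so "width" is never read unset */
	}
	return width;
}
```

Without the noreturn attribute, a compiler may assume the default branch falls through to the return statement and report "width" as possibly uninitialized.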
Since e24b77e7 ('DOC: config: move the extraneous sections out of the
"global" definition') the ACME section of the configuration manual was
moved from 3.13 to 12.8.
Change the reference to that section in "acme renew".
Tim reported in issue #2953 that "stick-table" and "table" were not
indexed as keywords. The issue was the indent level. Also let's make
sure to put a box around the "store" arguments as well.
In continuation with 9a05c1f574 ("BUG/MEDIUM: h2/h3: reject some
forbidden chars in :authority before reassembly") and the discussion
in issue #2941, @DemiMarie rightfully suggested that Host should also
be sanitized, because it is sometimes used in concatenation, such as
this:
http-request set-url https://%[req.hdr(host)]%[pathq]
which was proposed as a workaround for h2 upstream servers that require
:authority here:
https://www.mail-archive.com/haproxy@formilux.org/msg43261.html
The current patch then adds the same check for forbidden chars in the
Host header, using the same function as for the patch above, since in
both cases we validate the host:port part of the authority. This way
we won't reconstruct ambiguous URIs by concatenating Host and path.
Just like the patch above, this can be backported after a period of
observation.
Let's make sure we drop extraneous Host headers after having compared
them. That also works when :authority was already present. This way,
like for h1 and h2, we only keep one copy of it, while still making
sure that Host matches :authority. This way, if a request has both
:authority and Host, only one Host header will be produced (from
:authority). Note that due to the different organization of the code
and wording along the evolving RFCs, here we also check that all
duplicates are identical, while h2 ignores them as per RFC7540, but
this will be re-unified later.
This should be backported to stable versions, at least 2.8, though
thanks to the existing checks the impact is probably null.
It is especially a problem with Lua filters, but it is important to disable
the 0-copy forwarding if a filter alters the payload, or at least to be able
to disable it. While the filter is registered on the data filtering, it is
not an issue (and it is the common case) because, there is now way to
fast-forward data at all. But it may be an issue if a filter decides to
alter the payload and to unregister from data filtering. In that case, the
0-copy forwarding can be re-enabled in a hardly predictable state.
To fix the issue, an SC flag was added to disable it. The HTTP compression
filter sets it, and Lua filters do too if the body length is changed (via
HTTPMessage.set_body_len()).
Note that this is an issue because of a design flaw in the HTX: much of the
information about the message is stored in the HTX structure itself. It must
be refactored to move some of this information to the stream-endpoint
descriptor. This should ease modifications at the stream level, from filters
or TCP/HTTP rules.
This should be backported as far as 3.0. If necessary, it may be backported
on lower versions, as far as 2.6. In that case, it must be reviewed and
adapted.
There was no function for a Lua filter to change the body length of an HTTP
message, but one is mandatory to be able to alter the message payload. It is
not possible to directly update the message headers because the internal
state of the message must also be updated accordingly.
That is the purpose of the HTTPMessage.set_body_len() function. The new body
length must be passed as argument. If it is an integer, the right
"Content-Length" header is set. If the "chunked" string is used, it forces
the message to be chunked-encoded, and in that case the "Transfer-Encoding"
header is set.
This patch should fix the issue #2837. It could be backported as far as 2.6.
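
The mapping from the argument to the resulting header can be sketched in C as
follows. This is only a simplified model of the behaviour described above:
struct body_len_hdr and body_len_to_header() are hypothetical names, the
argument is modelled as a string, and the update of the internal message
state that the real function also performs is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: the header that has to be set. */
struct body_len_hdr {
	const char *name;
	char value[32];
};

/* Hypothetical model of the argument handling: "chunked" selects
 * Transfer-Encoding, a plain decimal integer selects Content-Length.
 * Returns 0 on success, -1 if the argument is neither.
 */
static int body_len_to_header(const char *arg, struct body_len_hdr *out)
{
	char *end;
	unsigned long long len;

	if (strcmp(arg, "chunked") == 0) {
		out->name = "Transfer-Encoding";
		snprintf(out->value, sizeof(out->value), "chunked");
		return 0;
	}

	/* only accept unsigned decimal digits, no sign, no spaces */
	if (arg[0] < '0' || arg[0] > '9')
		return -1;
	len = strtoull(arg, &end, 10);
	if (*end != '\0')
		return -1;

	out->name = "Content-Length";
	snprintf(out->value, sizeof(out->value), "%llu", len);
	return 0;
}
```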
There's a configurable limit to the number of messages sent to a
peer (tune.peers.max-updates-at-once), but this one is not applied to
the receive side. While it can usually be OK with default settings,
setups involving a large tune.bufsize (1MB and above) regularly
experience high latencies and even watchdogs during reloads because
the full learning process sends a lot of data that manages to fill
the entire buffer, and due to the compactness of the protocol, 1MB
of buffer can contain more than 100k updates, meaning taking locks
etc during this time, which is not workable.
Let's make sure the receiving side also respects the max-updates-at-once
setting. For this it counts incoming updates, and refrains from
continuing once the limit is reached. It's a bit tricky to do because
after receiving updates we still have to send ours (and possibly some
ACKs) so we cannot just leave the loop.
This issue was reported on 3.1 but it should progressively be backported
to all versions having the max-updates-at-once option available.
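
The principle can be sketched as follows in C. This is a simplified model,
not the peers code: struct peer_io and peer_io_pass() are hypothetical names,
and updates are reduced to counters. The point it shows is that the receive
loop stops at the limit without returning, so the send phase still runs in
the same pass.

```c
#include <stddef.h>

/* Hypothetical, simplified model of one peer I/O pass. */
struct peer_io {
	size_t pending_rx; /* updates waiting in the receive buffer */
	size_t pending_tx; /* updates we still have to send */
	size_t rx_done;    /* updates processed so far */
	size_t tx_done;    /* updates sent so far */
};

static void peer_io_pass(struct peer_io *io, size_t max_updates)
{
	size_t rx = 0;
	size_t tx = 0;

	/* receive side: count incoming updates and refrain from
	 * continuing once the limit is reached; the remainder stays
	 * in the buffer for the next pass.
	 */
	while (io->pending_rx > 0 && rx < max_updates) {
		io->pending_rx--;
		rx++;
	}
	io->rx_done += rx;

	/* no early return above: we still have to send our own
	 * updates (and possibly ACKs), under the same limit.
	 */
	while (io->pending_tx > 0 && tx < max_updates) {
		io->pending_tx--;
		tx++;
	}
	io->tx_done += tx;
}
```

With 100000 buffered updates and a limit of 200, one pass processes 200
incoming updates, leaves 99800 for later, and still sends what is pending.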
Using the "send-proxy" or "send-proxy-v2" option on a ring server is neither
relevant nor supported. Worse, on 2.4 it causes the haproxy process to
crash as reported in GH #2965.
Let's be more explicit about the fact that this keyword is not supported
under "ring" context by ignoring the option and emitting a warning message
to inform the user about that.
Ideally, we should do the same for peers and log servers. The proper way
would be to check servers options during postparsing but we currently lack
proper cross-type server postparsing hooks. This will come later and thus
will give us a chance to perform the compatibility checks for server
options depending on proxy type. But for now let's simply fix the "ring"
case since it is the only one that's known to cause a crash.
It may be backported to all stable versions.
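
The "ignore and warn" approach can be sketched as below. All names here
(SRV_OPT_SEND_PROXY, SRV_OPT_SEND_PROXY_V2, enum px_kind,
ring_srv_drop_send_proxy()) are hypothetical and only serve the example; the
real parser uses its own flags and warning helpers.

```c
#include <stdio.h>

/* Hypothetical option bits and proxy kinds, for illustration only. */
#define SRV_OPT_SEND_PROXY    0x1u
#define SRV_OPT_SEND_PROXY_V2 0x2u

enum px_kind { PX_KIND_REGULAR, PX_KIND_RING };

/* Clears the unsupported options on ring servers and emits a warning.
 * Returns the number of warnings emitted (0 or 1).
 */
static int ring_srv_drop_send_proxy(enum px_kind kind, unsigned int *opts)
{
	const unsigned int unsupported = SRV_OPT_SEND_PROXY | SRV_OPT_SEND_PROXY_V2;

	if (kind != PX_KIND_RING || !(*opts & unsupported))
		return 0;

	fprintf(stderr, "warning: 'send-proxy' and 'send-proxy-v2' are not "
	                "supported on ring servers, ignoring.\n");
	*opts &= ~unsupported;
	return 1;
}
```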
From the documentation, it wasn't clear enough that shards should
be followed by one of the options number / by-thread / by-group.
Align it with existing options in documentation so that it becomes
more explicit.
2025-05-14 19:41:38 +02:00
430 changed files with 11768 additions and 5907 deletions
	/* mostly config or admin stuff, doesn't change often */
@@ -347,6 +355,7 @@ struct server {
	short onmarkedup; /* what to do when marked up: one of HANA_ONMARKEDUP_* */
	int slowstart; /* slowstart time in seconds (ms in the conf) */
	int idle_ping; /* MUX idle-ping interval in ms */
	unsigned long last_change; /* internal use only (not for stats purpose): last time the server state was changed, doesn't change often, not updated atomically on purpose */
	char *id; /* just for identification */
	uint32_t rid; /* revision: if id has been reused for a new server, rid won't match */
@@ -422,6 +431,7 @@ struct server {
	int puid; /* proxy-unique server ID, used for SNMP, and "first" LB algo */